Tag Archive for 'PHP'

I’ve never had to scale

No, I’m not talking about my sex life, or anything like that. :) It’s just this: I’ve never had a site in the past that had too much success for its own good, and that, therefore, had scalability problems. Each one of my sites has either used some popular, usually well optimized software (say, WordPress or MyBB), or was mostly a bunch of static HTML pages. Neither kind, I believe, has “scalability” problems; the Internet connection or the web server itself (due to the number of simultaneous requests, not really related to what the app does) will complain long before “scalability” enters into it. And, sorry to say, except for a few occasional Digg / Shoutwire / Stumbleupon / Reddit effects, none of my sites was ever truly “stressed”.

That, I hope, is about to change.

My next project (which is about 75% complete) is a site that may well have scalability problems. Which is good, because I’ll learn about them, and how to cope with them.

What’s that project about? It’s a surprise. :) Suffice to say that, as far as I know, there’s only one other like it out there, and, weeks ago, it had to shut down its “free” version because it couldn’t deal with its success. In its first days it was quick, then it soon changed into “don’t wait; we’ll email you when the report is ready” mode, and finally the free version went under.

Now, I’m not a full-blown company, I’m “just zis guy, you know”, and I don’t believe I’ll have as much success as that one has / had. But… there is a demand; what happened to it is proof of that. And it’s quite possible that my code won’t scale.

I kind of hope it doesn’t. :)

Adventures with my Technorati ranks "toy"

As I mentioned here before, a couple of days ago I coded a program to take an OPML file and generate a table in which the sites listed on that file appear ordered by Technorati ranks. It also shows the number of incoming links (again, from Technorati), and each site’s PageRank.

(By the way: no, this is not ready for release yet. But it will be. Soon.)
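A rough sketch of the idea, in Python for illustration only (the actual scripts are not shown here): read the sites out of an OPML file and order them by rank. The sample OPML, the example.org URLs and the `ranks` dictionary are all made up; in the real program the ranks would come from Technorati. The `text` and `htmlUrl` attributes are the ones OPML subscription lists conventionally use.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML subscription list with two made-up blogs.
OPML = """<opml version="1.1">
  <body>
    <outline text="Blog A" htmlUrl="http://a.example.org/" xmlUrl="http://a.example.org/feed"/>
    <outline text="Blog B" htmlUrl="http://b.example.org/" xmlUrl="http://b.example.org/feed"/>
  </body>
</opml>"""


def sites_from_opml(opml_text):
    """Return (name, site URL) pairs for every outline that has an htmlUrl."""
    root = ET.fromstring(opml_text)
    return [
        (outline.get("text"), outline.get("htmlUrl"))
        for outline in root.iter("outline")
        if outline.get("htmlUrl")
    ]


# Made-up ranks standing in for the values fetched from Technorati.
ranks = {"http://a.example.org/": 5000, "http://b.example.org/": 1200}

sites = sites_from_opml(OPML)
# A lower rank is better, so an ascending sort puts the best site first.
table = sorted(sites, key=lambda site: ranks[site[1]])

for name, url in table:
    print(ranks[url], name, url)
```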

Initially, the data collecting part of my program started by clearing a table in a MySQL database, which would then be filled with the values it would get from Technorati and Google. However, this had two problems:

  1. Technorati allows only a limited number of accesses per day. I discovered this while running several tests: after about half a dozen or so, it stopped giving me data. The problem, then, was that the script had already cleared the table… so I ended up with an empty one.
  2. From time to time, Technorati gives me “wrong” ranks / links for a blog - values much lower (but not absurd / “bogus”, just wrong) than what they should be. It’s weird, and not reproducible, and usually, by asking Technorati again, the correct value is then returned.

To solve the first problem, obviously, I needed some way of keeping the data from the previous run while getting the new values, so that, if Technorati told me to get stuffed, I would still have the data from the day before.
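One general way to get that guarantee, sketched in Python with SQLite standing in for MySQL: fetch everything first, and only touch the table once the fetch has succeeded. This is not the method described in this post, just an illustration of the requirement. The `ranks` table layout, the `QuotaExceeded` exception and the `fetch_all` callback are assumptions made up for the example.

```python
import sqlite3


class QuotaExceeded(Exception):
    """Hypothetical error for when the remote service stops answering."""


def refresh(conn, fetch_all):
    """Replace the contents of `ranks` only if fetch_all() succeeds."""
    try:
        rows = fetch_all()  # list of (url, rank) tuples; may raise
    except QuotaExceeded:
        return False  # nothing was deleted, so the old data survives
    with conn:  # one transaction: both statements commit, or neither does
        conn.execute("DELETE FROM ranks")
        conn.executemany("INSERT INTO ranks (url, rank) VALUES (?, ?)", rows)
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranks (url TEXT PRIMARY KEY, rank INTEGER)")
conn.execute("INSERT INTO ranks VALUES ('http://a.example.org/', 1200)")
conn.commit()


def failing_fetch():
    raise QuotaExceeded("daily limit reached")


def working_fetch():
    return [("http://a.example.org/", 1100), ("http://b.example.org/", 5000)]


first_try = refresh(conn, failing_fetch)
rows_after_failure = conn.execute("SELECT url, rank FROM ranks").fetchall()
second_try = refresh(conn, working_fetch)
rows_after_success = conn.execute("SELECT url, rank FROM ranks ORDER BY url").fetchall()
```

After the failed run the table still holds the previous day's row; it is only replaced once a fetch completes.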

The second problem was a little more complicated, though, in a way, the solution to the first helped me crack it.

My method was this: when running the script, start by copying the original table to another (let’s call it temp1) and clearing the original table. Then get the new data to yet another table (temp2). Afterwards, regenerate the original table with data from temp1 and temp2, the following way:

  • if an entry (identified by the site’s URL) exists in only one of the tables, use it.
  • if an entry exists in both, use the common values (URL, site’s name), and for the 3 numeric values, choose the best value (from the two tables) for each. “Best” means the highest # of incoming links, the highest PageRank, and the lowest Technorati rank.

This way, if once in a while Technorati gives it a much worse value than it should (I’ve never seen it rate a blog better than the reality), it still has a more correct value to use instead.
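The merge rule above can be sketched like this, in Python for illustration and with plain dictionaries in place of the MySQL tables. The field names (`name`, `links`, `pagerank`, `rank`) and the sample data are invented for the example; only the rule itself (highest links, highest PageRank, lowest rank) comes from the description.

```python
def merge_tables(temp1, temp2):
    """Merge two {url: row} dictionaries, keeping the best value per field."""
    merged = {}
    for url in temp1.keys() | temp2.keys():
        old, new = temp1.get(url), temp2.get(url)
        if old is None:
            merged[url] = dict(new)  # only in the new data
        elif new is None:
            merged[url] = dict(old)  # only in the previous data
        else:
            merged[url] = {
                "name": new["name"],  # common value
                "links": max(old["links"], new["links"]),
                "pagerank": max(old["pagerank"], new["pagerank"]),
                "rank": min(old["rank"], new["rank"]),  # lower rank is better
            }
    return merged


# Made-up data: the new run returns a suspiciously bad rank for Blog A.
temp1 = {
    "http://a.example.org/": {"name": "Blog A", "links": 120, "pagerank": 5, "rank": 4000},
    "http://b.example.org/": {"name": "Blog B", "links": 30, "pagerank": 3, "rank": 20000},
}
temp2 = {
    "http://a.example.org/": {"name": "Blog A", "links": 90, "pagerank": 5, "rank": 9500},
    "http://c.example.org/": {"name": "Blog C", "links": 10, "pagerank": 2, "rank": 50000},
}

merged = merge_tables(temp1, temp2)
for url in sorted(merged, key=lambda u: merged[u]["rank"]):
    print(merged[url]["rank"], merged[url]["name"])
```

In this sample, Blog A keeps its earlier rank and link count even though the new run reported worse ones, and the blogs that appear in only one table are carried over unchanged.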

Sounds fine, doesn’t it? But there’s a problem with this method… which I solved later, but which I’ll discuss in the next post. Until then… any guesses as to what it was? :)

It’s official: I’m on the job market again. :)

As of now, I’m looking for a job in the Lisbon area, in Portugal. Sorry to any non-Portuguese readers / potential employers, but I am not ready to move abroad at this point in my life.

My ideal job, at the moment, would be a senior sysadmin / junior PHP programmer “hybrid”, but I’m open to alternatives.

As I’ve said here before: no outsourcing, no MS stuff, no helpdesk. I don’t see this as “arrogance” on my part, but simply as not wanting to waste both sides’ time.

For more details (in Portuguese), and the full CV, please visit www.pedrotimoteo.com/cv . Thank you. :)

My Technorati ranks "toy"

Inspired by Carlos Andrade’s own tool, I’ve just coded a couple of scripts to take an OPML file and show an ordered table of Technorati ranks. Naturally, I used it for my own Planet site, Planet Atheism.

Here it is: Technorati Ranks for Planet Atheism members

The implementation was ridiculously simple (and there’s a lot of room for improvement), but, other than Carlos’ tool, I didn’t find any scripts or utilities to do this. And, yes, I searched. Therefore I may release the code soon, as the 2nd project on software.dehumanizer.com, since this can be a fun “toy”. :)

[EDIT: added each blog's Google PageRank to the table. Why not? :) ]

Announcing DailyTasks 0.1

A few minutes ago, I submitted my first piece of software to Freshmeat (it hasn’t been approved yet; it will probably take a few hours): DailyTasks. It’s a small utility, written in PHP, with both a command line mode and a web interface, which, surprisingly enough, reminds you of daily tasks. :)

The web page linked above tells the “story” in more detail, but, basically, I’m much too chaotic to use traditional task management programs (every time I tried, I seemed to spend more time updating tasks than actually doing them). Still, I wanted something to remind me, every day, to do something, from “clean up GMail’s spam folder” through “update a blog” to “do the laundry, if necessary”. :) There was already a similar program (frequent-task-reminder), but it lacked some features that I wanted (such as non-accumulating tasks), and so I wrote my own.

It’s really basic stuff, with no bells and whistles, and the PHP code would probably scare you, so impressionable young people should avoid looking at it. :) But maybe — just maybe — you’ll find it useful.




Creative Commons Attribution-NonCommercial-NoDerivs 2.5 Portugal