30 Minutes with Theo Schlossnagle

p. This is another in a series of interviews I’ve been conducting as I travel to different conferences. Most of the people I get the chance to interview are directly involved with PHP in some way. This interview is a bit different. Theo Schlossnagle is the founder of OmniTI, one of the premier PHP consulting companies in the world. OmniTI consults with companies all around the globe on issues ranging from architecture to security.

p. Theo himself is not a PHP programmer. As a matter of fact, he’s not a big fan of PHP or web scripting for that matter. However, he does know a thing or two about hardware architecture, application design and scalability. He was kind enough to sit down with me for thirty minutes and talk about these topics and more.

p. **Can you give us a little background on OmniTI? Why did you start it?**
OmniTI was started in 1996. We started out doing mostly ad-hoc consulting for technology companies that were suffering from the dot-com explosion. Most companies back then built sites but didn’t have the foresight to build them in a fashion that would work well as the community that they were servicing grew. People started coming to us with sites that worked great with five thousand or ten thousand people but then their business grew and they were servicing one hundred thousand people and the site fell apart. So we did a lot of performance tuning and scalability analysis at that point. From that point we grew it into a full-fledged business.

p. Starting around 2003 we really adopted the entire business stack. Now we will take people’s business functional requirements, evaluate them and determine how good they are. We then turn those into actual technical functional requirements. We will then take that all the way down the stack to the operating system level. We will even build out the data centers when needed.

p. **That’s all very interesting. Let’s move into a different area now and talk about how OmniTI finds and retains programmers. How do you find good talent?**
Finding talent is probably the hardest thing to do in the industry these days. One of the things that Open Source exposed to the world was that in the software industry, like any industry, there is utter incompetence around every corner. There are good people, there are great people, and then there are people who can talk the talk but don’t walk the walk at all. So I would imagine it’s just as difficult to find a really talented carpenter; the challenges are the same. You really have to see their work. You have to talk to them and understand their ambitions and you have to get a feel for how much they really want to own what they do.

p. One of the greatest problems we have is finding people that are good but that also are interested in taking ownership of what they produce. We don’t do “drive by” consulting at OmniTI. We have clients that have been with us for eight years now. That means that the things we built eight years ago, we still maintain today. So it’s extremely important to us that you take pride in your work and you do a good job because you are going to be stuck with it for a really long time. So that’s a challenge.

p. One of the nice things about OmniTI is that it kind of works like a law practice in some ways. We have different principals at OmniTI. Each principal has their own team that they manage. We cross over those borders when we do hiring interviews; we do pretty wide interviews. We will have three or four people interview the same person. It ends up being up to the principal whose team the person works on as to whether they get hired or not.

p. **It has to make your hiring a little easier since you have so many of the big names in PHP working for you already.**
Yes it does and it’s interesting. It is true that we have more PHP talent than just about any company out there but that wasn’t a business initiative or strategy. There were just a lot of smart people who were hard working and extremely dedicated and they happened to be in the PHP community. They don’t do just PHP work at OmniTI. Several of them work on the Message Systems product line.

p. **Let’s talk a bit about Message Systems. I know email isn’t really PHP but it is a major part of what you do.**
Back in 1999 we had a professional services client who was facing a lot of email challenges. They were sending out confirmation emails for services. Their email demands at the time placed them as the second largest email center in the world. They were trying to build their system off of open source solutions and we built that for them and we also, of course, had to maintain it and that’s where masochism comes in. The company was a lot smaller then and I spent countless nights up at 3 AM just trying to get things to work. The solutions we were working with just didn’t work. They weren’t designed for that kind of thing. The uses and abuses of email have all evolved a lot since 1980 but the MTA architectures have not. The feature sets have evolved a little but the underlying architectures have not. It just did not fit what they were trying to do. So we set out to build a new architecture for them, something that is much more applicable to the demands of today and tomorrow.

p. **So Message Systems is different from Exim or Sendmail in what way?**
Exim, Sendmail, Postfix, qmail and others like them are a hybrid between a Mail Transport Agent (MTA) and a Mail Delivery Agent (MDA). When mail comes into your system it is received by an MTA. The MTA routes the mail; it decides where it’s going to go. If you send an email to a contact, your mail client submits it to your MTA. Your MTA will route it to some other MTA on the Internet. That’s the MTA’s purpose. If it decides that it needs to store that mail for you locally, typically it hands it to an MDA. The MDA is what actually stores the message on disk. Exim and other open source projects have very simplistic MDA features built into them so that they can write stuff into your POP folder or whatever. Message Systems has that too but we really try to focus on the MTA part, not the MDA part.
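
p. The split between routing and delivery described above can be sketched in a few lines of Python. This is only an illustration, not how Exim or Message Systems is implemented: @LOCAL_DOMAINS@ and @route@ are hypothetical names, and a real MTA would consult DNS MX records to pick the next hop instead of returning a label.

```python
# Hypothetical set of domains this server stores mail for.
LOCAL_DOMAINS = {"example.com"}


def route(recipient: str) -> str:
    """Decide what the MTA does with a message for one recipient."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in LOCAL_DOMAINS:
        return "mda"   # hand off to the delivery agent, which writes the mailbox
    return "relay"     # forward to another MTA on the Internet


if __name__ == "__main__":
    print(route("bob@example.com"))
    print(route("alice@elsewhere.example"))
```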

p. The thing our architecture really allows you to do is apply business logic as the message is passing through on the transport level. So when you get a TCP connection from a remote host and they start their conversation to send you an email, you can start taking actions on that in languages like PHP or Perl or Java. We also have our extension language called Sieve that glues them all together. Sieve is a fairly obtuse language but it allows you to do things like look at headers, match the FROM and TO, things like that. Our version is called Sieve++. We’ve adapted it to run before the message is actually received. This allows us to run rules and we can choose not to accept the message.
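
p. To make the idea of refusing a message before it is received more concrete, here is a minimal Python sketch of a policy check that runs on envelope data only, before the sender transmits the message body. It is not Sieve++ and not the Message Systems API; @Envelope@, @check_envelope@ and both domain sets are hypothetical names chosen for illustration. The reply codes follow ordinary SMTP conventions (250 to continue, 550 to refuse).

```python
from dataclasses import dataclass

# Hypothetical policy data; a real deployment would load this from configuration.
BLOCKED_SENDER_DOMAINS = {"spam.example"}
LOCAL_DOMAINS = {"example.com"}


@dataclass
class Envelope:
    """What the server knows before the message body is sent."""
    client_ip: str
    mail_from: str
    rcpt_to: str


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def check_envelope(env: Envelope) -> tuple[int, str]:
    """Decide from envelope data alone whether to keep talking to the sender.

    Returns an SMTP-style (code, text) reply: 250 to continue, 550 to refuse.
    """
    if domain_of(env.mail_from) in BLOCKED_SENDER_DOMAINS:
        return 550, "sender domain refused by policy"
    if domain_of(env.rcpt_to) not in LOCAL_DOMAINS:
        return 550, "relaying denied"
    return 250, "OK"
```

p. Because the decision is made before the body arrives, a refused message costs no storage and no queue slot.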

p. **So what is Message Systems’ main competition?**
Well, it directly replaced solutions like Exim and Sendmail. I’m actually a big fan of Exim. I was one of the early adopters and big pushers of Exim. The original architecture we built was based on Exim. We had 16 machines running 24×7 all running Exim and sending somewhere near one hundred and fifty thousand to two hundred thousand messages per hour, per machine, which was a lot in 1996. If you turn around and look at your web server and say “My web server serves one hundred and fifty thousand requests per hour”, the response is usually something like “is it idle?” One hundred and fifty thousand requests per hour for a web server is nothing. One hundred and fifty thousand per second is the challenge. We feel that we should be able to send one hundred thousand messages per minute.

p. We found the architecture flawed for that level. We kept thinking “why can your web server serve so much and not your mail server?” So we stepped back and approached it from that angle.
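
p. The figures quoted above are easier to compare once they share a unit. The short Python calculation below converts them to messages per second. The interview does not say whether the goal of one hundred thousand messages per minute is per machine, so the final ratio is only meaningful under that assumption.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_MINUTE = 60


def per_second(count: int, period_seconds: int) -> float:
    """Convert a count over a period into a per-second rate."""
    return count / period_seconds


# Figures quoted in the interview.
exim_low = per_second(150_000, SECONDS_PER_HOUR)    # per machine, lower bound
exim_high = per_second(200_000, SECONDS_PER_HOUR)   # per machine, upper bound
target = per_second(100_000, SECONDS_PER_MINUTE)    # stated goal

print(f"Exim per machine: {exim_low:.1f} to {exim_high:.1f} messages/second")
print(f"Target:           {target:.1f} messages/second")
# Assumes the target applies to a single machine.
print(f"Target is {target / exim_high:.0f}x to {target / exim_low:.0f}x the Exim rate")
```

p. Roughly 42 to 56 messages per second per machine is the kind of number that would make a web server look idle, which is the comparison Theo draws.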

p. So Message Systems serves the same purpose as Exim but the environments it’s deployed in are usually different. If Exim works for you then use it; it’s a great tool. You probably don’t need something like Message Systems. However, if you are running six Exim boxes you should probably be running one Message Systems box. That’s the goal, to tackle bigger and more challenging problems. If your problems aren’t big and challenging, there are a lot of tools that you can use. Your choice of tools gets limited as your problem gets bigger.

p. **Is Message Systems an off the shelf product or is it a service you install for a customer?**
It’s an off the shelf product. You can download it and install it. Everything you need is there and it has excellent technical manuals.

p. **Let’s talk about your book for a bit. What is in the book and more importantly, why did you decide to write it?**
It’s interesting. The book is titled “Scalable Internet Architectures”. I’ve been giving talks for the past four years on that topic and I repeat myself every time. It’s an interesting talk because it doesn’t revolve around any specific technology. It’s not a code based talk. So every year, it’s pretty much the same because all of the same things apply. It’s mainly principles, policies, and basically a systematic approach to building things that don’t break and building things that are big. Rules of thumb so you don’t shoot yourself in the foot when you build big systems. These will be the same next year as they are today. It’s a mentality.

p. So I kept giving the same presentation and got good reviews from the audiences. People were telling me that it really helped change the way they think about their problems. I decided that it would translate pretty well into a book.

p. It’s an interesting book. It’s small but it’s dense. It covers a lot of topics on a high level. It’s not the goal of the book to have you walk away knowing how to implement things. The goal of the book is to have you walk away and realize what’s not going to work and what is going to work. It tries to give you enough tools to ask the right questions so that you can arrive at the right answers. The point of the book is that I don’t have the right answers for your problem, because you didn’t tell me your problem yet.

p. All the other technical books I read say things like “I need to build a website. Well you should use Ruby on Rails.” (It’s Ruby this year; in the past it’s been Java, Perl, PHP and others.) The point is that I’ve not even described my problem before the book has already proposed a solution. It doesn’t make sense. My book is really a guide for arriving at how to ask the right questions, how to evaluate technologies to make sure the solution really fits the problem.

p. **Let’s move on a bit and talk about scaling, specifically scaling upwards. Without trying to incite a flame war I’d like to know: out of all the languages you have used, which one just does not scale?**
In general, I wouldn’t pick a particular product name or anything, I’ll just say that languages don’t scale. The word doesn’t even apply to a language. It’s like saying, “does English scale”. If you have a lot of people speaking English then I guess it scales. It’s really a bad word for talking about languages. Saying “Java doesn’t scale” simply means that the code you wrote in Java doesn’t scale well. That’s because of the code you wrote, not Java.

p. **So an application written in any language can scale?**
Absolutely, there’s no reason that a competent programmer in Ada or Fortran can’t do what a competent programmer in C or Perl or Ruby does. Imagine this: if you say “I can do things in Ruby that you can’t do in C.” Well, the Ruby compiler is written in C. So certainly you can do everything in C that you can do in Ruby. It may be a whole lot of work but it certainly is possible.

p. The thing that typically delivers you to a scalability nightmare is choosing a solution that doesn’t fit your problem. The biggest offender of that in the world, I would guess, is frameworks in general. Frameworks aren’t bad any more than Java is bad. It’s just that if you choose a framework because it’s cool but it doesn’t solve your problem, chances are that you are going to end up with a variety of problems you didn’t expect. The other challenge in the industry today is scaling databases in general. They tend to be single points of scaling so people try to scale them vertically instead of horizontally. Scaling databases is a really big challenge. The problems tend to be so different that you arrive at different solutions. There’s not a single “turn-key” solution that you can implement that will fix all of your database scalability problems.

p. **What is the most common problem that you are called upon to help people overcome?**
I think the biggest problem is that they use some sort of blueprint for their application that delivers them to having a single database on the backend. It’s interesting, at the end of my talk I give a little example, sort of a brain teaser for people. I ask them to try and scale a system that supports a million users on-line. The system results in five thousand inserts per second and two thousand queries per second. That’s a challenging problem with big tables and it’s not an incredibly high-performance system. People always tell me “I can do that with my database.” But that’s not really the point. The point is great, now your system just got one hundred times bigger. Now we are talking about five hundred thousand inserts per second and two hundred thousand selects per second. Every solution originally proposed suddenly has to change dramatically as the problem changes. The goal of scalability is not to have to change the solution, you just grow it.
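
p. The difference between growing a solution and changing it can be shown with a small Python calculation. It rests on two assumptions that are not from the interview: that the insert load can be split evenly across independent nodes, and that one node sustains the hypothetical @NODE_INSERTS_PER_SECOND@ figure. A design built around a single database fails the first assumption, which is the problem Theo describes.

```python
def nodes_needed(load_per_second: int, node_capacity_per_second: int) -> int:
    """Smallest number of equal nodes whose combined capacity covers the load."""
    return -(-load_per_second // node_capacity_per_second)  # ceiling division


# Hypothetical capacity of one database node; measure your own.
NODE_INSERTS_PER_SECOND = 5_000

today = nodes_needed(5_000, NODE_INSERTS_PER_SECOND)
after_growth = nodes_needed(500_000, NODE_INSERTS_PER_SECOND)

print(f"Nodes for 5,000 inserts/second:   {today}")
print(f"Nodes for 500,000 inserts/second: {after_growth}")
```

p. If the data can be partitioned, a hundredfold increase in load means adding nodes to the same design. If it cannot, no single machine is likely to absorb the difference and the design itself has to change.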

p. So to answer your question, people keep sticking everything in their database and that’s the real problem. Say you put user preferences in your database and every time someone goes to a page it results in fifty-eight queries on the database. If you have one hundred users or one thousand users or even one million users, you can usually pull that off with brute force. However, if you go to one hundred million users or your site becomes a whole lot more interesting and your users are visiting pages more rapidly, then you start to realize that your database is a point of contention. It’s extremely hard to scale those systems horizontally.

p. **Ok, so you don’t stick everything in a database, what do you do? What are other options for storing persistent data that would allow your system to handle the traffic as it increases?**
Well, I think defining persistent is an interesting first step. How often does this data change? Does it absolutely need to be in the database at every change point? So say, during the life of my session, my session information changes 45 times. Does that matter? Does it have to be in the database every single time, do I need an audit trail or is it ok that I got that information and I changed it and at the end of my session, maybe it goes back into the database? So if I have one million users and they all do one hundred session changes during their session, do I do one hundred database requests for that or do I do two? Those sorts of things are extremely important. Specifically with user information, you know. You have the world’s most efficient distributed database at your fingertips, which is the user’s computer and cookies. You can store information there and you can even cryptographically protect that data so they can’t meddle with it. You can compress it if there’s too much by using gzip before you store it in the cookie. Those ideas all scale very well horizontally.
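
p. The question of one hundred database requests versus two can be illustrated with a minimal Python sketch. @CountingStore@ is a hypothetical stand-in for a database that simply counts requests, and @Session@ is an illustrative class, not any framework’s API. The session loads once, keeps changes in memory, and writes back once when it is closed.

```python
class CountingStore:
    """Stand-in for a database that counts how many requests it receives."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}
        self.requests = 0

    def read(self, session_id: str) -> dict:
        self.requests += 1
        return dict(self.rows.get(session_id, {}))

    def write(self, session_id: str, data: dict) -> None:
        self.requests += 1
        self.rows[session_id] = dict(data)


class Session:
    """Loads session data once, keeps changes in memory, writes back once."""

    def __init__(self, store: CountingStore, session_id: str) -> None:
        self.store = store
        self.session_id = session_id
        self.data = store.read(session_id)  # request 1: load at session start
        self.dirty = False

    def set(self, key: str, value: object) -> None:
        self.data[key] = value  # in memory only; no database request
        self.dirty = True

    def close(self) -> None:
        if self.dirty:
            self.store.write(self.session_id, self.data)  # request 2: save at end
            self.dirty = False


if __name__ == "__main__":
    store = CountingStore()
    session = Session(store, "abc")
    for step in range(100):
        session.set("step", step)
    session.close()
    print(store.requests)
```

p. The trade-off is the one raised in the answer above: intermediate changes are never recorded, so this only fits data that does not need an audit trail and can tolerate losing unsaved changes if the session never closes cleanly.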

p. You have web servers. People always tell me they don’t want to cryptographically encode the cookies because crypto is too expensive; the same with compression. Well, you are compressing and encrypting on these web servers already and you can add more web servers. Using this method, you are storing the information in a place with the user. This means that you have as many nodes in your database now as you have users. If any one of them crashes, there are no availability concerns because they are not using their computer when it’s crashed. It’s like a perfect distributed system. People don’t leverage user-side cookie storage enough.
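
p. A compressed, tamper-evident cookie of the kind described above can be sketched with the Python standard library. This is one possible approach under stated assumptions, not the code OmniTI uses: @SECRET_KEY@, @encode_cookie@ and @decode_cookie@ are hypothetical names, @zlib@ stands in for gzip (both use DEFLATE), and an HMAC signature is used so that users cannot alter the data. Signing does not hide the contents; data that must stay secret would also need encryption.

```python
import base64
import hashlib
import hmac
import json
import zlib
from typing import Optional

# Hypothetical server-side secret; load it from secure configuration in practice.
SECRET_KEY = b"replace-with-a-long-random-secret"
SIGNATURE_BYTES = 32  # length of a SHA-256 digest


def encode_cookie(data: dict) -> str:
    """Serialize, compress, and sign session data for storage in a cookie."""
    text = json.dumps(data, separators=(",", ":"))
    payload = zlib.compress(text.encode("utf-8"))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(signature + payload).decode("ascii")


def decode_cookie(value: str) -> Optional[dict]:
    """Return the session data, or None if the cookie is malformed or altered."""
    try:
        raw = base64.urlsafe_b64decode(value.encode("ascii"))
    except (ValueError, UnicodeEncodeError):
        return None
    signature, payload = raw[:SIGNATURE_BYTES], raw[SIGNATURE_BYTES:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # Verify before decompressing so untrusted bytes never reach zlib.
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(zlib.decompress(payload))
```

p. Every web server that holds the secret can verify the cookie on its own, so adding web servers adds capacity without adding load on a shared database.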

p. People tell me “I have 100k of data, this won’t work for me.” My response is always, “No you don’t, you have 100k of redundancy in your session. Go look at it again.” I’ve worked with clients who do ninety SQL requests per page and their users switch pages rapidly and will visit fifty to sixty pages in a session. Sixty pages times ninety database requests per page is a lot for a single user. They tell me “I have too much data to put it back in the user’s cookie.” So we sit with them, we stare at it for a while; we help them figure out exactly what they need and what they really don’t need. We turn true and false into bits. By doing this, we’ve been able to take those pages that do 90 database requests per page down to one database request on the first page and zero on all others. So now we’ve gone from more than one thousand database requests per user to one.
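
p. Turning true and false into bits, as mentioned above, might look like the following Python sketch. The preference names in @FLAGS@ are hypothetical; the point is that any number of yes/no settings can travel as one small integer instead of one database row or query each.

```python
# Hypothetical preference names; bit positions follow the order of this tuple.
FLAGS = ("newsletter", "dark_mode", "show_images", "beta_features")


def pack_flags(prefs: dict) -> int:
    """Pack boolean preferences into one integer, one bit per flag."""
    bits = 0
    for position, name in enumerate(FLAGS):
        if prefs.get(name, False):
            bits |= 1 << position
    return bits


def unpack_flags(bits: int) -> dict:
    """Expand the packed integer back into named booleans."""
    return {name: bool((bits >> position) & 1) for position, name in enumerate(FLAGS)}


if __name__ == "__main__":
    prefs = {"newsletter": True, "dark_mode": False,
             "show_images": True, "beta_features": False}
    packed = pack_flags(prefs)
    print(packed, unpack_flags(packed) == prefs)
```

p. New flags should be appended to the end of @FLAGS@ so that the bit positions already stored in users’ cookies keep their meaning.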

p. That’s how we help our customers stay ahead of their competition. If you spend all your resources solving problems that someone else can solve for you in a really simple and scalable way, your competitors will beat you on the core technologies. You end up just playing catch-up and it’s a really bad situation.

p. **So is it your recommendation that when companies are building out their new web application that they build for big or should they concentrate their efforts on what makes their site unique?**
You don’t ever want to build for big. You do, however, want to architect for big. It’s important that you understand the answer to the question “When my traffic goes up by a factor of one hundred, how will my architecture have to change?” This and others like it are generally easy questions to answer. Some of them, however, will kick you in the butt. There has yet to be a site where everything was perfect when things started to scale up. So when you have users and you are storing them in a database you can say “great, I have one million users and everything is working fine.” What happens when you suddenly go to one hundred million users? Asking yourself that question when you are designing your system may lead you in a different direction.

p. **One question about frameworks, it seems to me that you are saying that evaluating and selecting the correct framework for your project is a critical piece.**
Yes. Selecting the wrong framework is a nail in your coffin. I’m not going to say that there is a right one. I’m not really a big fan of frameworks. It is my belief that the reason there are so many frameworks is that there is exactly one framework for each really smart person with a unique problem. This means that if you build your site well and it’s a unique site, which means your problems are different than someone else’s, there’s going to be one more framework out there and it’s going to be yours. All these frameworks that are popping up are individuals solving their own unique problems. It’s the blind belief that a particular framework is a “save all” that is the real danger.

p. **What is the one new technology that really excites you?**
I’m kind of lost in all of it. I enjoy big problems and I tend to find different big problems all the time. Most of the stuff I have been doing lately is database oriented. I guess the thing I’ve been most excited about over the last year has been Solaris 10. We do a lot of troubleshooting at OmniTI. We get a lot of calls from people who say “My web site is broken, what’s wrong with it?” Whenever I get those calls and I find myself working on a Solaris 10 box, all of a sudden I’ve got every tool I need to accomplish the job quickly. Solaris 10 has a really impressive feature set.

p. At OmniTI, we live in a world where our clients don’t really care about technology; we care about technology. Our clients care about business. So if the solution is closed source or open source, they don’t care. They just want the right tool for the job. So we have a lot of solutions that end up being ninety-five percent open source and five percent closed source. Solaris, with their API across releases, is a godsend for commercial software.

p. **Finally, what blog or website do you read on a daily basis?**
OmniTI’s internal ticketing system.

I’d like to say thank you once again to Theo for taking the time to sit down with me. If you’ve not read my review on “SCALABLE INTERNET ARCHITECTURES”:http://www.amazon.com/Scalable-Internet-Architectures-Developers-Library/dp/067232699X/sr=8-1/qid=1159187862/ref=pd_bbs_1/103-9726978-8689424?ie=UTF8&s=books you can find it “here”:https://devzone.zend.com/article/893. If the topics we discussed in this interview are of interest to you, you will really like the book.