The question of "HOW". A SAAS story

SmokeTree

Developer/Linux Consultant
BuSo Pro
Digital Strategist
I haven't been posting much because I've been deep in the code-woods working on a new lead-gen service that I am partnering with a client on. This is a rewrite of a previous system I had designed. I've been able to pick and choose whatever technologies I wish to work with and that has made all the difference in the world. Building this project got me to thinking about BuSo a lot and how I can best contribute.

We talk a lot here about things we've built. Just take a look around at all the success stories and sometimes failure stories that are shared here. There are many of them from some of the best of breed marketers out there. We talk about the SEO side and the marketing aspect, which is honestly the point of all of our code and hours spent. I'll be the first to say I don't have a ton to offer as far as SEO and marketing knowledge drops go. Most of you really have that covered in a very eloquent and succinct way. Thank you for that, it's much appreciated and helps me be more effective in my own projects.

With this in mind, what I feel I can offer BuSo is a bit over 30 years' experience as a programmer and "how" I build some of the things I do. I don't have any code-ego issues that I'm aware of and I'm always happy to talk with fellow devs about what languages and techniques they are using. Being at this for 30 years doesn't make me any better than anyone else, but it has definitely given me a lot more time to make a lot more mistakes and learn from them.


Let's Talk about "HOW"

So, I see these great stories of WHAT you've built and WHAT you're doing to market that product/idea, but it's rare I see something about HOW it's built. I think a lot of folks would like even a bit of insight into what it takes to get more than a "handful of scripts" project off and running and maybe even a glimpse into the mind of the developers that build these systems.

I'd like to share with BuSo the general architecture of a system I built and pretty much where my mind is at as a developer these days. I know we have quite a few experienced devs here and you might find this boring. If so, no worries, I'm just hoping to share some insight here and challenge those that might not be doing this type of thing to expand their knowledge. To "up the game of the collective" so-to-speak.

TLDR Warning!!!

Yes this is long. Also, this may be very boring to most, especially if you're not into the code side of things. Some of you may be working with the things I describe below and if so, spare yourself and turn back now, lol. For those that continue, I hope this can at least challenge the way you think about a medium/large scale SAAS type project and how the same mindset can apply to practically any project you can think of.

A bit of background

This is a SAAS that does one thing in life. It provides access to a database of data that is aggregated and scrubbed from about 20 different sites. The data is oftentimes not complete, so we use services like Pipl and outsourced data companies to help us fill in the blanks, as our customers are interested in complete names/addresses/phone numbers, etc. The scraped data goes through a series of "filters" so that the end result is clean data (kinda like the way Google works, only this is pretty effective). The system is currently in use in-house and has been running for about 5 years with no show-stoppers. I designed the original system in C#, primarily because the only way I could really drive a browser was with either the junk IE component for .NET or Awesomium, so I went with the latter.

And it worked, and worked quite well, but there's a difference between something that "just works" and something I feel good about maintaining, scaling and supporting. It was time to say goodbye to what we had and refactor the whole deal into something web-based. As developers, I feel we should always take a "how can I refactor this" stance with our projects, even if it means going with a completely different approach. Just spiking away at code until something barely works, then abandoning it to do the same thing with the next feature, will end up with a crap project held together with the duct tape of "at least it works" code and a mile-long todo list. By the time you realize it, you'll be doing a complete re-write the right way anyway. Spike to get it going, then refactor. Refactoring doesn't mean "make everything perfect", it means "make this a house I'd live in". So while we're on the subject of metaphors...

Smoketree and C# - Today on divorce court

So, this was designed "the right way" but hell, let's face it, it was a Winforms app... a front-end that also "talked" to the back-end (I used RabbitMQ for process inter-op). I gotta admit, it works well and has never crashed, but times change. We are no longer in a "winforms" world and haven't been for years. I did what I had to at the time because it was the best compromise between the path of least resistance and providing value to my client. I never enjoyed working with C# and Winforms, although I am pretty proficient at it. I "get" it, I just don't get why I have to write so much code to do so little. This client was the last of my clients I was supporting a C# app for (most of my work is web-based), so it was time to have a talk about the future of this project and moving away from any dependency on M$ tech. It was time for a change.

Project "Binary Phoenix"

We decided to expand into a SAAS in 2015. I was once again offered the opportunity to design the system from scratch, using what I felt was best. I have been working with Rails since the beginning and Ruby since about 2003, so naturally I picked Rails for the base framework. I'm a programming language addict. If it's out there, my skill level is anywhere in the range of "I've messed with it" to "I'd consider my knowledge solid". I have stuck with Ruby because I'm happy with it. It's a joy to code in and I can go from idea to implementation with code so succinct tears are shed.

Let's take a walk-through shall we?

Front-end:
  • Rails 4.2 and all the extra niceties that come with it (CoffeeScript/JavaScript, jQuery, Sass, etc.).
  • Bootstrap & Font Awesome
  • Jasny and Fuel-UX for certain components. For instance, Jasny makes it easy to do the "off canvas" thing.
  • Paloma gem for per-page javascript
  • Several gems for role based access control and authentication (devise, pundit and royce).
  • Moment.js for date handling
  • Websocket-rails gem for push/pull notifications. Will also use this to implement basic chat functionality with customers without having to use bold chat or something like that.
Back-end:
  • Linux (Ubuntu): I've used pretty much every distro of Linux, going back to the "Slackware" days. Hell, I got my first taste of UNIX with HP-UX and SCO. Out of all the distros I've worked with, Ubuntu has been the one that gives me the fewest headaches and gets well out of my way when I need to do things differently. I use Ubuntu for pretty much everything. If I need to do something that is all secret-squirrel secure, I'll go with CentOS or the like, but I need a damn good reason for it.

  • Apache with Phusion Passenger to support Rails: (http://httpd.apache.org/ and https://www.phusionpassenger.com/) This is relatively painless and I've been working with apache for quite a few years and am comfortable with it. I'm not entirely sold on apache for this use case and am considering nginx with passenger or a combo of nginx and puma.

  • Git for repository: I don't really use GitHub much for storing my code. It's a great thing and I don't mind contributing to the code of others to help the community, but really, I just don't want my code on a server I don't have root access to. Sorry, but that's just how I roll. I have a few "gitolite" servers (https://github.com/sitaramc/gitolite) set up to store my source code. Each time I edit my code and it works, I just commit to the central repository. When I deploy, my deploy scripts (Capistrano) grab the most recent codebase I have committed and use that for the deployment. If you aren't storing pretty much your entire life as a developer in some kind of revision control system, it's time to up your game because you're doing something horribly wrong.

  • Capistrano for deployment: (http://capistranorb.com/) Please don't tell me you still deploy software projects with SFTP, SCP, etc.? Isn't it much nicer to just go to a command line, run "bundle exec cap production deploy" and have your whole project deployed to your server, the server restarted/reloaded (if need be), file permissions straightened out and any other commands you have to run taken care of? Yes it is. It's lovely, so look into Capistrano. You can deploy pretty much anything you want, it doesn't just work with Rails. Just please, don't use the SFTP/SCP route to deploy your stuff anymore. We've already partied like it's 1999, those days are gone.

  • Supervisor for process monitoring: (http://supervisord.org/) Supervisor does what it says on the label: it monitors processes and restarts them when they shit the bed. You don't need to write specialized daemon processes unless you have a good reason or you're a masochist. All you have to do is write a script that does the infinite-loop thing and sends its output to stdout. Supervisor will take care of logging the stdout and stderr and will keep track of the PID for you.

  • Ruby processes that take care of scraping: (https://www.ruby-lang.org/en/) Since the sites we're scraping make use of Javascript calls, I'm using the poltergeist gem which is a driver for Capybara that allows me to easily use PhantomJS with Ruby. Some of you might be familiar with PhantomJS by having used it in other contexts such as with CasperJS. It's basically all PhantomJS underneath, just a different wrapper. Find out more about PhantomJS here (http://phantomjs.org/) and poltergeist here (https://github.com/teampoltergeist/poltergeist).

  • Compiled Go code for classifying and general maintenance: I seriously dislike working with C and C++ for things that require speed, and I always have. It's not because I don't understand them, it's just that the code is really not "pretty" no matter what. With Go, I can get pretty close to the speed of C/C++, the code is compiled to machine code and I can also do things with concurrency patterns that you wouldn't want to touch with threads. (https://golang.org/). Go is used when I need that shot of nitro.

  • MySQL for database: Nothing much to be said here. A good solid database. I shift between MySQL and Postgres mostly. If I'm doing something small, like an app that runs on a raspberry pi, I'll use sqlite. (http://www.mysql.com/)

  • Redis for all queues and to facilitate process inter-op: (http://redis.io/) Here's what I hope you take with you, if nothing else. How many times do you have scripts/processes that need to "talk" to each other? If you've tried basic RPC patterns, you know it's seriously a PITA: yeah, it'll work for the language you wrote it in, but what if you need to communicate with processes written in other languages, in a VERY simple way with no BS flaming hoops to jump through? I'd like to challenge your thinking a bit. How about only having to worry about 2 things? They are:
    • Can my language communicate with redis?
    • Can my language allow me to easily work with the JSON format?
    That's it. Redis has libraries for any language I can think of in use today. Additionally, the same applies for JSON libraries. Here's a working example:
  • Each process initializes and registers itself in redis where the information is stored in hashes. Every process maintains its own state and just passes on the part of its state it wishes to share to redis. This is updated anywhere from every second to every minute, depending on the process and how chatty it needs to be.

  • Each process creates an open channel to itself using the pub/sub capabilities of redis. If I want to know the state of another process or operation, I just communicate with the process directly via its channel. The processes never communicate directly with each other, they only need to worry about redis.

  • There are many other queues and lists that are used to hold scraping tasks, progress of said tasks and a plethora of other objects of interest.

  • Most messages are passed in JSON format. A typical "scrape command" might look something like this:
    {"command":"do_scrape","task_id":5150,"listen_after_processed":1}
    If a process is asked for its state, the reply looks something like:
    {"process_id":"6f3011fc-d386-47ce-872e-8a6adfe81d26","current_action":"sleeping","last_checkin":"1422121755"}

  • Since every language I'm using understands the above JSON format and can also communicate with redis, I have a very easy way to communicate with all the working parts of the system. So maybe for some reason I have amnesia, forget the pain of PHP and decide to write a scraper or some other part of the system in PHP. No problem. My PHP process doesn't really care about the other parts of the system or what language they are written in. It only needs to be able to talk to redis, send/parse the messages and know how to react.
So why redis? For one, it runs in RAM but you can also persist to disk at predefined intervals (I do this about every 3 seconds or so), so you're not going to lose it all if the server shits itself. Yeah, I know, you're going to tell me "But MySQL has memory tables". Sure does, but it doesn't give me an easy way to ensure that a given task is only handed to one process (think transactions and locking hell in MySQL), and I also have other needs that MySQL memory tables weren't built for, like easy persistence to disk.

With redis I can just "pop" a task off the queue and rest assured that it won't be given to another process that just happened to ask for the data at the exact same time (yes it happens, albeit rarely). With pub/sub (publish/subscribe) I have a very lightweight way to communicate with any other part of the system at any time. I can have tables in MySQL and sync them with objects in redis that contain the same data. Any time I have to read data, I hit redis first because the lookup is pretty much instant. In short, redis holds it all together and does a beautiful job at it.
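To make the pattern concrete, here's a minimal sketch in Ruby (redis and json gems). The key names ("workers", "scrape_queue") and the connection details are made up for illustration; only the message format matches the examples above.

  require 'redis'
  require 'json'
  require 'securerandom'

  redis      = Redis.new(host: 'localhost', port: 6379)
  process_id = SecureRandom.uuid

  # 1. Register this process in a redis hash so anything else can see it.
  redis.hset('workers', process_id, {
    process_id: process_id,
    current_action: 'sleeping',
    last_checkin: Time.now.to_i.to_s
  }.to_json)

  # 2. Listen on a private pub/sub channel in a background thread so other
  #    parts of the system can ask this process for its state.
  Thread.new do
    sub = Redis.new(host: 'localhost', port: 6379) # a subscriber needs its own connection
    sub.subscribe("worker:#{process_id}") do |on|
      on.message do |_channel, msg|
        request = JSON.parse(msg)
        # react here, e.g. publish the current state back on the requester's channel
      end
    end
  end

  # 3. Main loop: pop tasks off a queue and act on them.
  loop do
    raw = redis.lpop('scrape_queue')   # atomic: no other worker can grab this task
    if raw
      task = JSON.parse(raw)           # e.g. {"command":"do_scrape","task_id":5150,...}
      # ... do the scrape for task["task_id"] ...
    else
      sleep 1
    end
  end

The same few lines translate to PHP, Go or anything else with a redis client and a JSON library, which is the whole point.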

This system has been running in pre-production for about 2 weeks now and I don't anticipate anything tragic happening. I've done my homework and built something I feel is ready for prime time. This system will be generating around $10k a month out of the gate, as it's just a rebuild of an old system that has been generating revenue for a while now.

So to end this highly long-winded post, I really hope that some of the above may give some of my fellow devs a few new ideas or even a completely different mindset. Nothing I'm sharing here is particularly new or very innovative, it's just the patterns I've been using the past few years. In other words, I didn't invent any of this, I'm just using the tools above for what I feel is the best purpose. Also I'm sure you "can do this in language X" too. That's what makes things interesting and I'd love to hear about how you're using your favorite tools to build your projects.

So I ask my fellow builders, how do you build what you build?
 
Wow. I have no clue what most of this means, but I do appreciate the big-picture of keeping your infrastructure current (and in a version that you're proud of). Thanks for sharing all of this. I know others here can make good use of it and share more back with you.

All I can really add is "click the one-button Wordpress install in cpanel, login, choose a theme, and type some stuff." That's how I build right now. I try to play with the HTML and CSS where I can but I need to study that more too. It's frustrating to know what these things do, but not know HOW to make them do it. I have some homework to do in that area.
 
Just stopping in to say that @SmokeTree is the fucking man.

Even if you don't know what a lot of this stuff is, by reading it you can see the mindset that he has that makes him so good at what he does. And how hard and long he works to be good at it.
 
Big thumbs up for putting so much time into sharing this comprehensive methodology.
Great demonstration of passion and aptitude - inspiring to say the least.
Thanks for coming out of the woods and inducing motivation for my own endeavors...
 
Nice.

Having been part of replacing old "but it's working" software projects, this is good.
Seriously Go, though?

WOW

Now I have to go and look at redis and capistrano.

Thanks!

::emp::
 
@SmokeTree What's your optimal structure for a queue system (using Redis of course)?

I've got several scenarios where I'm switching the overall processing of data so it scales better to over 100 servers. I figure a single Redis server that holds all the data and the end results from the worker bots, which then uses one or two "writers" to send data to the main database, to reduce overall pressure on MySQL. The main driver is that as expansion occurs, MySQL can only handle so many simultaneous connections, and it's really not efficient in terms of speed. I know it sounds crazy, but 22-45 writes per second is way too slow.

So here is the setup I want to implement:

Supervisor_Reader: This is a single bot that loads up all the processes and tasks which need to be done for the day/hour, or whenever. It basically sets up all the data that needs to be processed within Redis. It only reads from MySQL into Redis.

Worker: These guys are the ones that will process all the data. There are generally a dozen tasks each can perform, with the functions stored in a class library, so upgrading each worker is as easy as syncing files across a vast network. (That reminds me, what do you recommend for pushing out updates across a network? This "capistrano" thing? If I have to touch Ruby it's a no go. I need something I can "push" and have 100+ servers get the new files. I thought Gitbox (git) could do all that, but I might be doing something wrong.)

The workers will check the RAM of the computer they are on and not go over a set amount; they will also check other specs so each worker knows the limits of the server it is running on and knows when to pull back. Each worker then "lpop"s the next task in line, and depending on the type of task, it will process that task and send the saved data to the main Redis. In that process it either expires or deletes the data used.

Each worker has its own ID, and it's tagged within the saved data so we know where it originated - for future debugging in case of problems. Worker bots will no longer need to write to MySQL, because MySQL is extremely slow and locked tables can affect overall performance. I've seen instances of cascading ripples backing up data because 1 bot decided to act up. No more of that.

Supervisor_Writer: This is a single bot that writes all the newly processed data to the main MySQL. In fact, this bot can in theory be duplicated across the system if the writes become too much for a single instance. I know this way won't affect MySQL the same way, since currently there are 100-200 MySQL users writing at any given second. What I'm essentially doing is dropping all that down to a single MySQL user that writes.

--

Also, BGSAVE had to be completely disabled, since in the early stages the act of saving data was causing delays: we were pushing 2GB of data per minute into Redis and then deleting that data in less than 60 seconds. BGSAVE would try to save every X commands or every X minutes and look for data that's pretty much a ghost by then. This is BEFORE the new queue system is in place, so I figure at max 10GB of data per minute being created and deleted within a minute is a realistic capacity ceiling. If the Redis server shits itself and comes crashing down, it's not a big deal as long as it doesn't corrupt the main database. Each bot is already able to recreate the structure it needs within Redis.

---

My analogy/setup comes from a factory scenario, with supervisors and workers handling tasks. At least that's how I envision it. I guess what I'm looking for is to understand what an optimal queue should look like for maximum efficiency using Redis, and if I'm missing anything. Do you have an example queue structure/setup you can share?
 
Do you have an example queue structure/setup you can share?

@CCarter You got it. Here's a bird's-eye view of how I'm doing something very similar. I believe the pattern I'm using (actor-based concurrency) could be a good fit for what you describe.

My setup is mainly a mix of Ruby and Go. The same thing can be done in many other languages. Feel free to experiment.

This particular system is managed by a Web based front end in Rails. My client can do everything from scheduling the various tasks to starting and stopping the various parts of the system with a browser. I hardly need to log into the back end to do anything other than general server maintenance, and I even have most of that automated too.

The backend is written in Ruby (with Go mixed in here and there) and I am implementing the actor model (https://en.wikipedia.org/wiki/Actor_model). Every part of the system is an actor. A dispatcher is an actor, a consumer/worker/agent is an actor, etc. All of the actors are managed by their own supervisor that is responsible for invoking them and managing their state. If an actor dies, the supervisor process will restart it in a clean state, ready to try again. On the process level, I use supervisord (http://supervisord.org/) to manage the individual processes and also to start processes at boot.

MySQL is used to store the "big" stuff (i.e. the final data) as well as the tables needed as part of the system configurations. Things like passwords or API keys are almost always contained on the server as environment variables that are specific to the user the system runs as.

Redis is used for just about everything else, from queues to storing task information for tasks in various stages of processing. The task queues are implemented using lists. Lists are the perfect data type for queues because you get atomic operations like lpop that ensure only one worker gets a given task.

Meet the Actors

Dispatchers

Dispatchers handle and regulate tasks for all worker processes in their scope. I might have 200 tasks that need to happen, some more involved than others. Every minute the dispatchers check the schedule tables and if there are tasks to be done, the tasks are pushed to their respective queues using "rpush". A "task" is a JSON string that contains instructions for the worker as far as what job to do and in certain cases, how to do it.
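A stripped-down sketch of that dispatcher loop in Ruby (the schedule lookup is stubbed out and the queue name is made up):

  require 'redis'
  require 'json'

  redis = Redis.new   # defaults to localhost:6379

  def due_tasks
    # In the real system this would read the schedule tables; here it's a stub.
    [{ command: 'do_scrape', task_id: 5150, listen_after_processed: 1 }]
  end

  loop do
    due_tasks.each do |task|
      redis.rpush('scrape_queue', task.to_json)   # push the task onto the worker queue
    end
    sleep 60                                      # dispatchers check once a minute
  end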

Workers/Agents

A worker is basically any actor that needs to do something. I have scrapers, health monitors and various other things. Here's the workflow of how a scraper works.

A worker starts, going through basic initialization and checking in. I use GUIDs that are assigned to each process and each one has to "register" to make sure the GUID matches or the process will not work. I have processes running on various servers and this helps add an extra layer of identity and security to the mix. So let's say all is well and the worker was able to "register". It then listens to its queue for any tasks assigned to it and checks every second in a loop that will try to "lpop" a task from the queue. Let's say a task is found (otherwise lpop would return nil). This would happen:

The task is now "popped" out of Redis and into the worker's memory. The worker reads the task, does some quick validation of the instructions and then moves it over to an "in progress" list which I store using hashes. The reason for this is that, let's say the worker process dies while it was trying to do something. The supervisor process will bring the worker back to a clean state and the first thing the worker will do is check the "in progress" for a task matching the worker's id. If a task is found it's picked back up and resumed. This helps ensure there are no missed tasks. So now we have the scraper, scraping away at whatever data the task specified. I could have the scraper write its findings back to Redis or MySQL. We're not dealing with a huge volume of concurrent writes so letting the scrapers write to MySQL is fine for our use case.

All throughout the lifecycle of a task, the workers send status updates to Redis for their task. Using the front end tools, my client and I can view the status of all of the actors in the system. Since the workers all have their own configurable logging levels/methods, when something goes wrong, I'm not only alerted of it, I have a good idea what's wrong and how to fix it.

When a task is complete, the worker moves it from the "in progress" hash to a "task completed" hash. The start/stop time of each task is saved as well as other operational data that I can analyze later.
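Putting that worker lifecycle into a rough Ruby sketch (key names like "in_progress", "completed" and "scrape_queue" are placeholders, and the actual scrape is elided):

  require 'redis'
  require 'json'

  redis     = Redis.new
  worker_id = ENV.fetch('WORKER_ID')   # assume the registered GUID is handed to the process

  # On startup, check whether a task was in progress when this worker last died.
  task_json = redis.hget('in_progress', worker_id)

  loop do
    task_json ||= redis.lpop('scrape_queue')          # atomic pop; nil when the queue is empty
    if task_json
      redis.hset('in_progress', worker_id, task_json) # claim the task before doing any work
      task = JSON.parse(task_json)
      # ... validate the instructions, scrape, write results to MySQL/redis ...
      redis.hdel('in_progress', worker_id)            # done: release the claim
      redis.hset('completed', task['task_id'].to_s, task_json)
      task_json = nil
    else
      sleep 1
    end
  end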

Sweepers
Sweepers are actors that prune db tables and "sweep" through the system making sure everything is OK and the system is in a clean state. This sounds simple but sweepers have a tough job.

Some notes about my setup:

The whole system is asynchronous for everything but the scrapers, because they block (their own process) when they are in the middle of navigating a site and waiting for page loads and such. Also, because I'm making extensive use of logging, I have excellent insight into system performance metrics. Most importantly, no one part of the system can crash the whole thing. Each process has its own supervisor that will take care of bringing it back up if need be. It has been about a year with this system and I haven't had a major crash or anything resembling a showstopper.

Some thoughts on having a single process or group of processes that are the ones that write to the DB:

Having a single writer or even consolidating writers to a few instances can help mitigate DB load. With this kind of thing, the burden is on the writer process so it's definitely important to build in good logging to monitor performance and robust fault tolerance. The main thing you're going to have to deal with here other than reliability is speed. You don't want the system (or users) to be waiting for anything. Where there's waiting, there's blocking.
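For what it's worth, here's roughly the shape a consolidated writer could take, as a sketch only. It assumes the mysql2 gem plus a "pending_writes" list and a "results" table, all of which are illustrative rather than anything from the systems described above:

  require 'redis'
  require 'json'
  require 'mysql2'

  redis = Redis.new
  db    = Mysql2::Client.new(host: 'localhost', username: 'writer',
                             password: ENV['DB_PASS'], database: 'leads')
  insert = db.prepare('INSERT INTO results (task_id, payload) VALUES (?, ?)')

  loop do
    row_json = redis.lpop('pending_writes')     # workers rpush finished rows here
    if row_json
      row = JSON.parse(row_json)
      insert.execute(row['task_id'], row_json)  # MySQL only ever sees this one connection
    else
      sleep 0.5
    end
  end

If the single writer ever becomes the bottleneck, batching several rows per transaction (or running a second writer instance, as you mentioned) is the obvious next step.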

Deployment

I use Capistrano to deploy everything these days (http://capistranorb.com/). Yes it's written in Ruby and does require a small amount of configuration. What do I get for that work? Well, here's how I deploy:

  • On my dev box I "cd" to my project directory. I make sure the code I want to deploy is checked into its git repo and pushed up to the server I host my git repos on.
  • I run the command "bundle exec cap production deploy"
  • I sit and wait while Capistrano does the rest, logging into the various servers and updating the code by checking it out of the central repo. When it's done with all that, it will restart my webserver and handle anything else I would need to do if I were deploying manually.
There may be something else out there that does what Capistrano does, but I haven't seen anything that is anywhere near as feature-rich. A deploy of a complex site/app takes me about 2-3 minutes. 10 seconds of that is doing the first two steps above, the rest is just waiting and making sure all goes well. I haven't touched an SFTP app in years.
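For anyone curious what that "small amount of configuration" looks like, a bare-bones Capistrano 3 setup is roughly this (application name, repo URL and hostnames are placeholders):

  # config/deploy.rb
  set :application, 'binary_phoenix'
  set :repo_url,    'git@git.example.com:smoketree/binary_phoenix.git'
  set :deploy_to,   '/var/www/binary_phoenix'
  set :keep_releases, 5

  # config/deploy/production.rb
  server 'app1.example.com', user: 'deploy', roles: %w[app web db]

After that, "bundle exec cap production deploy" does everything described above; any custom steps (restarting workers, fixing permissions) go in as tasks hooked into the deploy flow.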

I hope this gives a bit of insight into how I code and my workflow these days. Builders, feel free to add in your own tips and tricks. My way isn't the only game in town (thankfully, lol).
 
I have a question. I see a lot of mentions of MySQL, but have any of you, @SmokeTree or @CCarter, used MongoDB? It's a NoSQL DB which stores its data as documents of key:value pairs.

I don't have 30 yrs of dev experience, but what it sounds like to me is that the DB is not efficient enough to do the high-volume work you need completed. It may be time to look into a more modern database for what you need done.

The reason I mention MongoDB is that a lot of startups are building on NoSQL databases in order to anticipate viral traffic or explosive growth in their apps, products, etc.

On a side note, has anyone used MeteorJS yet? It's JavaScript from end to end (it can run in the browser and on the server) and offers real-time (reactive) capabilities right out of the box, amongst other things.
 
@btcquake, actually at one of my SAAS products we are introducing a new interface that's 90% Redis NoSQL, and all live data is stored in RAM, so it's even faster than the speed of sound in response. One thing we monitor closely is response time, and looking at the patterns of users' usage, we found out that we can store 90% of their interactions with the interface within Redis as soon as they log in, so MySQL usage is drastically reduced. Response is already fast, averaging 0.9 seconds for 80% of users, but with Redis we are approaching terminal velocity.

In early tests we are seeing responses as low as 10 ms for data processing; it takes the browser, JavaScript, etc. longer to load on the user's side than it takes us to respond with the data the user asked for. That will be my "mission accomplished" moment when this goes live.

But also, in my scenario for the Redis queue, all the processing is done through Redis and saved in Redis, and only when it's time for the "final" writes does the data get written to MySQL as its final home. So the queue system is a complete switch from traditionally using MySQL for writing, reading and all the talking to processes, cronjobs, queues and other work, to relying on the Redis NoSQL approach for processing and even user interaction.

We've already seen a 12x speed increase in processing time when removing MySQL from the equation on some of the lighter processes, so once both the backend and frontend are off of MySQL in terms of interacting with it, it'll just be used as "final" storage.
 
@CCarter, cool, it seems as though you are aware of the benefits of NoSQL & Redis. But I totally understand tech stack changes don't happen overnight. Thanks for the reply.
 
@btcquake, I've tried a few NoSQL DBs such as mongodb, couchdb, riak, redis, etc and like a lot of what NoSQL has to offer. I understand that in certain cases having the data in more of a document based or key-value format makes sense over the tried and true approach of relational databases and data normalization.

At the same time, relational DBs like MySQL, PostgreSQL, etc offer features that are important to me like data integrity and type safety. I like knowing that I can define a field that I expect to only contain an integer and I'm not all of a sudden going to find strings there. Referential Integrity is another thing. I don't have to write a ton of extra code to ensure I'm not going to end up with orphaned data, because I can let the DB take care of a lot of that. In short, right now my trust for the "final data store" is with relational DB systems in a normalized format.

These days, I'm a fan of a hybrid approach. Using MySQL as the final data store and using NoSQL (mostly redis) for things like session data and other data like statistics, analytics, social and other data that is not "mission critical" in the event of a crash. Data from MySQL can be queried and de-normalized into a NoSQL DB when need be. That way I'm getting the best of both worlds and I can rest easy, knowing the data will be consistent.
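A toy example of that hybrid read path in Ruby, just to illustrate the idea (the "leads" table, key names and TTL are made up; it assumes the redis and mysql2 gems):

  require 'redis'
  require 'json'
  require 'mysql2'

  REDIS = Redis.new
  DB    = Mysql2::Client.new(host: 'localhost', username: 'app',
                             password: ENV['DB_PASS'], database: 'leads')

  # MySQL stays the source of truth; hot rows get denormalized into redis
  # so repeat reads never touch the relational DB.
  def cached_lead(lead_id)
    key = "lead:#{lead_id.to_i}"
    if (json = REDIS.get(key))
      JSON.parse(json)                              # cache hit: instant lookup
    else
      row = DB.query("SELECT * FROM leads WHERE id = #{lead_id.to_i}").first
      REDIS.setex(key, 3600, row.to_json) if row    # keep the redis copy for an hour
      row
    end
  end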
 
@SmokeTree I like the hybrid approach of NoSQL for non-mission-critical data and MySQL as the "final data store". Hmm, brain is stewing. Thanks
 
General answer: I've got a couple of SaaS projects built on Laravel. I generally spin up a quick VPS w/ Forge and get nginx and everything else installed and rolling. Then I use Bitbucket for the repo and push deployment to staging and production/master branches. I generally rely on Bootstrap and jQuery (maybe some AngularJS sprinkled in) for my front end. This suits my needs for 90%+ of quick MVP projects.

For Wordpress stuff where I don't need to reinvent the content publishing wheel, I have a hand coded wp starter theme I created that I just modify for the project. I've also written my own plugins for pretty much anything I care about.

I've recently started toying with redis - and I love it. Similar to what others have said, storing user data at login and then writing to MySQL at intervals has seriously sped up my application response times. I tend to offer huge dataset style "paid access" SaaS products so the redis model described in the posts above fits most of my use cases for relieving pressure from MySQL.
 