How to Install Ferret, the Full Text Search Engine with Your Rails Application

Ferret is a full text search engine based on the popular Lucene engine, which was originally written for Java. There is a great tutorial available here: http://www.railsenvy.com/2007/2/19/acts-as-ferret-tutorial

Assuming you have the acts_as_ferret plugin installed in your vendor/plugins directory, you’ll also need to install the Ferret engine itself on your box. Use RubyGems like so:

gem install ferret

If you’re running in dev mode, it should just work automatically. In production, however, remember to start your Ferret server in production mode with the following command:
./script/ferret_server start -e production

Starting the Rails Console in Production Mode

To specify which mode you’d like the Rails console to boot up in, just pass the environment name as a plain argument, without any flags.

./script/console production
./script/console test
./script/console development

If you’re on Windows, remember to use the backslash “\” rather than the forward slash “/”. I believe you’ll also have to invoke the Ruby interpreter explicitly, like so…

ruby script\console production

When the cloud is a good idea

A buddy of mine is a talented, but paranoid System Administrator for a small web company. He refuses to see the truth and the beauty in the wonderful world made possible by cloud computing services like Amazon’s EC2. Amazon is not the only game in town. There are other, and in my opinion more flexible, providers out there such as http://slicehost.com, http://joyent.com and http://rackspace.com.

His main critique involves a lack of control over the physical disks, vague terms of service, in some cases poor technical documentation, and implementations that don’t replicate everything you get when managing your own server. For example, when your server instance fails in the cloud, your data is not persisted, at least not with Amazon. This is no doubt a limitation, and a serious one at that.

All these services offer a nearly identical product, with prices that differ by only a few dollars. On average you’ll find $70 a month for 1 GB of RAM with 30 GB of hard drive space. Rackspace has the best offer with Mosso at $43 a month for 40 GB of hard drive space and 1 GB of RAM. You can get server instances as small as 256 MB of RAM for about $10 a month. These costs don’t include bandwidth, but unless you’re doing a lot of traffic this won’t cost much. A full T1 will run you ~$350 a month with Mosso. As more and more players hop into the game, I wouldn’t be surprised to see prices go down even further.

Management of your server is simple with a nice web interface for rebuilding, rebooting, DNS configuration and scheduling backups. You can choose from the most popular operating systems such as Ubuntu, CentOS and even Windows, get full root access and can upgrade/downgrade at any time at a prorated cost. It’s wonderful and beautiful!

Just because something is easy doesn’t necessarily make it a good idea. But that doesn’t prove the opposite is true either. Just because something is easy doesn’t make it a bad idea. So why is using the ‘cloud’ a good idea, and more importantly, when is using the ‘cloud’ a good idea?

I watched a great presentation by Mr. Bezos, Amazon’s CEO, in which he compared the IT infrastructure at Amazon to the American electricity grid of the early 20th century. Bezos envisions a world where developers can just ‘plug’ in to an infrastructure that delivers everything you could possibly want from a dedicated hosting solution at a fraction of the cost. An infrastructure that maximizes power and efficiency and makes it easier to administer your ‘server’, not more complicated.

Amazon and the rest of the lot can do this because they deliver a service that a lot of people use in a similar way, so they can scale costs accordingly. Amazon in particular has expert industry knowledge and a massive operating budget, thanks to the infrastructure requirements of its flagship product, Amazon.com. Smaller hosting companies have taken advantage of open source technologies like Xen virtualization to deliver scalable dedicated virtual boxes.

Bezos has a beautiful analogy, but it does have its limits. Computing is not the same thing as electricity. Household appliances all consume electricity in the exact same way. Applications for the web, however, do not all consume compute power the same way.

Where cloud computing offers a lot is for the lone developer, small team or startup in the proof-of-concept stage. For those who understand software but not hardware. Where the investment of time and resources in hardware adds a burden that takes away from the development of their product(s). And when the product(s) they deliver have nothing to do with the means they choose to deliver them. The developer or small team just requires something that is secure, consistent and reliable.

I won’t argue that the cloud will ever be more secure, consistent and reliable than doing it yourself, if you decide to do it right. Maintaining 100% control over the physical disks and having direct access to the colocation facility is more secure, consistent and reliable without a doubt. However, the caveat is that because you’ve decided to do it yourself, you have to do it right! If you don’t, you’ll most likely end up with a setup that is neither secure nor consistent, and very, very unreliable. This raises the question: how does one do it right, and how much does it cost to do it right?

The cost can vary, but it does involve hiring a System Administrator for $50K to $100K a year and spending over $10K on equipment and colocation facility hosting.
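To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Ruby. It uses only the figures quoted in this post (about $70 a month per cloud instance, a $50K to $100K salary, and $10K treated as a floor for equipment). The instance count and the method names are my own assumptions for illustration, and bandwidth is left out on both sides.

```ruby
# Figures quoted in the post; everything else here is an assumption.
CLOUD_MONTHLY_USD = 70                      # ~1 GB RAM / 30 GB disk instance
ADMIN_SALARY_RANGE_USD = [50_000, 100_000]  # yearly salary range
EQUIPMENT_USD = 10_000                      # "over 10K", so treated as a floor

# Yearly cost of running a given number of cloud instances (no bandwidth).
def cloud_yearly(instances)
  instances * CLOUD_MONTHLY_USD * 12
end

# First-year in-house cost as a [low, high] pair: salary plus equipment.
def in_house_yearly_range
  ADMIN_SALARY_RANGE_USD.map { |salary| salary + EQUIPMENT_USD }
end

instances = 3 # hypothetical setup, e.g. web, database and staging boxes
low, high = in_house_yearly_range

puts "cloud:    $#{cloud_yearly(instances)} per year"
puts "in-house: $#{low} to $#{high} in the first year"
```

Even if you triple the instance count, the cloud figure stays well below the low end of the in-house range, which is the gap the rest of this argument leans on.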

If you’re that lone developer, small team or startup in the proof-of-concept stage, you may or may not be able to afford these costs. You can of course contract out, and the cost would be lower. But I’m comparing against the cost of a setup that has someone fully dedicated to it. I’m also assuming that the developers, who know software, will be able to learn to manage the server remotely, and thus my argument involves eliminating the formal role of System Admin. Because you no longer need to worry about the physical limitations of your setup (hard drive failures, new OS installs on new machines, UPS, basic redundancy and the rule of 3), developers can wear two hats.

How can the cloud be as consistent, secure and reliable? The answer is that Amazon and other cloud providers are hiring engineers who make six figures. Not just one or two, but tens or hundreds or even tens of hundreds, more than you can afford to hire. It makes sense that they will have the resources, knowledge and experience to deliver a product that is consistent, secure and reliable.

Bugs and such are inevitable; after all, humans are still doing the work. However, the bug risk isn’t eliminated just because you bring it in house. In fact, odds are that the risk is higher! Who would you put your money on, 1 or 2 guys at a small company or the combined knowledge of all the engineers at Amazon.com? Now, that is an extreme comparison and of course there exists a middle ground. If you happen to find a great admin who does quality work and understands business goals etc., his consultation might be worth the price. However, if you’re not so lucky, a product from the cloud might be less risky than the infrastructure you invest in: the IT fort some employee constructs for job security!

You get what you pay for without a doubt. It’s just that in the cloud it’s not just you paying and therefore you get more for your dollar.

git checkout -- file-in-question.oh-my

Problem: you’re using git and one file, like your db/schema.rb, is out of whack with the latest branch. You run git pull, the merge fails, and nothing gets updated. You can, of course, reconcile the differences manually. But what if you’re confident that the latest committed schema.rb is correct? Instead of manually editing the file and doing all that work, you can discard your local, uncommitted changes to it with the git checkout command and then run git pull again.

Solution:  git checkout -- <file>
… or for the Rails application example above…
git checkout -- db/schema.rb

rails fixtures: using the right timestamp

Fixtures in Rails allow you to quickly and easily populate a database with sample data. They are great when testing your app and you need to test against a condition that will rely on a certain preexisting data point.

Fixtures are located at RAILS_ROOT/test/fixtures/model.yml, where model.yml stands for the model in question.
A sample fixture (note: YAML files require consistent indentation!):

one:
  id: 1
  title: my sample fixture
  description: this is an example of a fixture
  created_at: <%= 2.days.from_now.to_s :db %>

You can load your fixture data like so

rake db:fixtures:load

This assumes that you have already run your schema migrations (rake db:migrate) and that the attributes in your fixtures map to the correct attributes in your schema. You’ll get an error otherwise. You can also specify the environment when you run your rake tasks like so…

rake db:fixtures:load RAILS_ENV="production"

As you may have noticed, you can execute Ruby code in your fixtures if you place it in the normal ERB tags, <%= %>. Furthermore, you have access to the Rails API, which gives you handy methods like 2.days.ago.to_s. When creating your fixtures you might be tempted to use <%= Time.now %> for your created_at and updated_at fields. Don’t! Instead, use the Rails API like so… <%= 1.month.from_now.to_s :db %>. This stores the value in the same format that Rails uses when time stamping your models. Otherwise, you might find parts of your application break when running tests against your fixtures because the data types do not correspond.
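If you want to see what is going on without booting Rails, here is a minimal plain-Ruby sketch of the same idea: the fixture is run through ERB first and only then parsed as YAML. The db_timestamp and render_fixture helpers are hypothetical names of mine, not part of Rails. The format string mirrors the "YYYY-MM-DD HH:MM:SS" layout that to_s(:db) produces, and the value is quoted in the YAML so it stays a plain string in this standalone example.

```ruby
require "erb"
require "yaml"

# Same layout that Rails' to_s(:db) produces: "YYYY-MM-DD HH:MM:SS".
DB_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical helper: format a Time as a database-style timestamp string.
def db_timestamp(time)
  time.strftime(DB_FORMAT)
end

# A fixture template; the ERB tag is evaluated before the YAML is parsed.
# Two days are added as plain seconds because 2.days needs ActiveSupport.
FIXTURE_TEMPLATE = <<~YAML
  one:
    id: 1
    title: my sample fixture
    created_at: "<%= db_timestamp(now + 2 * 24 * 60 * 60) %>"
YAML

# Render the ERB with a given "now", then parse the resulting YAML.
def render_fixture(now)
  yaml = ERB.new(FIXTURE_TEMPLATE).result(binding)
  YAML.load(yaml)
end

fixture = render_fixture(Time.utc(2009, 5, 1, 12, 0, 0))
puts fixture["one"]["created_at"] # prints 2009-05-03 12:00:00
```

Passing the current time in as an argument, rather than calling Time.now inside the template, keeps the sketch repeatable from one run to the next.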

mysql on rails 2.3.2

the mysql driver is no longer bundled w/ rails. you’ll need to install it yourself w/

sudo gem install mysql

however, on ubuntu (hardy heron) this won’t work out of the box. install these packages first

sudo apt-get install libmysql-ruby libmysqlclient-dev

if libmysqlclient-dev fails… try libmysqlclient15-dev

then run

sudo gem install mysql