
I recently worked on a data-heavy Ruby on Rails project where seeds became crucial to developing quickly. I came to realize there are at least two different ways to use seeds, even though little has been written about them and there is no prescribed Rails Way. Here I aim to clarify the approaches, and in the next post I will describe how our use of seeds evolved between these approaches (and others) over the course of the project. Before I describe the two ways to use seeds, let’s briefly discuss what we should represent as code versus data.

Code vs data

Understanding seeds begins with a critical question about the difference between code and data. We have to decide whether to represent our entities as methods or constants in code, or as objects in data. We often start out by setting things up in code, or by codifying objects as rigid models that we think will remain constant across all uses. A simple example would be product categories. But sometimes even these become dynamic later, as your project realizes its potential. For example, we first modeled an educational framework by creating a model for each level of the framework. Later we realized the project could be used for a wide range of applications beyond education, and we had to allow users to create abstract, flexible hierarchies. As a rule of thumb, use constants in code for things that stay fixed, and create models to represent things whose attributes you anticipate will vary across users and scenarios.
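To make the distinction concrete, here is a minimal sketch of the same product categories represented both ways. The names (ProductCategory, Category, the category values) are hypothetical, and a plain Struct stands in for what would be a database-backed model in a Rails app, so that the example runs on its own.

```ruby
# Code representation: a fixed list baked into the application.
# Changing it requires a code change and a deploy.
module ProductCategory
  ALL = %w[books music games].freeze

  def self.valid?(name)
    ALL.include?(name)
  end
end

# Data representation: each category is a record that can be added,
# renamed, or removed at runtime. In Rails this would be a model backed
# by a table; a Struct is an assumption made so the sketch runs alone.
Category = Struct.new(:name, :owner)

categories = ProductCategory::ALL.map { |name| Category.new(name, "system") }
categories << Category.new("board games", "some user") # no deploy needed

puts ProductCategory.valid?("board games") # the constant list is unchanged
puts categories.map(&:name).inspect        # the data list has grown
```

The constant is simple and fast to check against, while the record form is what lets users define their own entries later.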

It gets slightly more nuanced. If you aren’t sure whether something is code or data, remember that it’s possible to change a code representation into a data object (model) later. It’s good to leverage the benefits of each type of representation for your particular use case. Some static objects, like country codes, billing codes, and statuses, are better kept out of code, because it doesn’t make sense to couple them with your implementation. A model representation, especially one that you populate using a seeds.rb file, is perfect for static, constant data that doesn’t change per client of your product and that doesn’t need to live with your business logic. Be aware of the particularities of your project and of the trade-offs as your project matures.

Now that we’ve decided what should be code and what should be data, we can talk about what data goes into seeds.rb. Here are two approaches.

1. The bare minimum

First there’s the type of data that you need for the system to function. Think of it like configuring your app for your particular environment. For a workflow app, you might need to define the different stages or statuses in the workflow. Later on, your project might evolve to allow a user to customize their own stages, statuses, or product categories, but just to get the app running so that you can develop a particular feature, you’ve decided to start off your database with these populated. This kind of seeds.rb usage also includes all the static data that you have decided not to represent in code but that you don’t anticipate will change per user or scenario, like user roles.
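A bare-minimum seeds.rb along these lines might look like the sketch below. Role and WorkflowStatus are hypothetical model names, and the role and status values are made up for illustration. In a Rails app they would be ActiveRecord models and create! would come from ActiveRecord; the small InMemoryModel class is only a stand-in so the example runs outside Rails.

```ruby
# Stand-in for ActiveRecord so this sketch runs outside Rails.
# It keeps created attribute hashes in memory, one list per class.
class InMemoryModel
  def self.records
    @records ||= []
  end

  def self.create!(attributes)
    records << attributes
    attributes
  end

  def self.count
    records.size
  end
end

class Role < InMemoryModel; end
class WorkflowStatus < InMemoryModel; end

# The part that would live in db/seeds.rb: only the reference data the
# app needs in order to function, and no users or sample objects.
def seed_reference_data
  %w[admin editor viewer].each { |name| Role.create!(name: name) }
  %w[draft in_review approved].each { |name| WorkflowStatus.create!(name: name) }
end

seed_reference_data
puts "#{Role.count} roles, #{WorkflowStatus.count} statuses"
```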

If you were to actually start the app using these seeds, however, you’d still have no data with which to interact. You have no users, no objects, no instances of your models. Limiting your seeds file in this way keeps it lean and easy to use in any environment, but it makes initial app development slower and more tedious.

2. Quick object creation

Especially when building a prototype or interface that is still just your best guess at what the project should be, you need a way to manually test your features while developing. Seeds help speed up the process of building and testing your app locally because you waste no time creating objects in the console or building a UI that helps you create those objects. At such an early stage you’re not even sure that these objects correctly represent what the app needs. You want to spend most of your time validating the interface and discovering how people will use it.

In the early phases of development, this kind of seeds.rb is deliciously useful! In a few seconds, you can take your development database from nothing to everything you need to see your app in action. You’ll be making lots of changes to seeds.rb while you develop in this way, so it’s useful to have a quick script that resets the database and reseeds it.
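One way to keep this kind of seeds.rb quick to change is to generate the sample records from small helper methods. The sketch below is an illustration under assumptions: the helper names, the user and project attributes, and the counts are all hypothetical. It only builds attribute hashes; in a Rails seeds.rb each hash would be passed to a model’s create! call.

```ruby
# Build attribute hashes for sample users. In seeds.rb each hash would be
# handed to a model, for example User.create!(attrs) (User is hypothetical).
def sample_users(count)
  (1..count).map do |i|
    { name: "Test User #{i}", email: "user#{i}@example.com" }
  end
end

# Build a few sample projects per user so every screen has something to show.
def sample_projects(users, per_user:)
  users.flat_map do |user|
    (1..per_user).map do |i|
      { title: "Project #{i} for #{user[:name]}", owner_email: user[:email] }
    end
  end
end

users = sample_users(3)
projects = sample_projects(users, per_user: 2)
puts "#{users.size} users, #{projects.size} projects"
```

Changing a count or adding an attribute is then a one-line edit, which suits a file you expect to rewrite often.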

Things to remember

Treat seeding as a destructive operation. Keeping this in mind will help you avoid issues once you start building your app, because you won’t try to rely on seeds after your database is populated. If you plan to use seeds, assume that your database has only tables and no records. Seeds work best for starting up a pristine environment, as opposed to seeding over an already populated database.
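If you want seeds.rb to enforce the empty-database assumption rather than just rely on it, a guard at the top of the file is one option. This is a sketch, not an established Rails feature: ensure_pristine!, DatabaseNotEmpty, and FakeModel are hypothetical names. The guard only needs objects that respond to name and count, which ActiveRecord model classes do; FakeModel is a stand-in so the example runs outside Rails.

```ruby
class DatabaseNotEmpty < StandardError; end

# Raises if any of the given models already has records.
# models: objects responding to #name and #count.
def ensure_pristine!(models)
  populated = models.select { |model| model.count.positive? }
  return if populated.empty?

  names = populated.map(&:name).join(", ")
  raise DatabaseNotEmpty, "Refusing to seed: records already exist in #{names}"
end

# Stand-in for a model class so the sketch runs outside Rails.
class FakeModel
  attr_reader :name, :count

  def initialize(name, count)
    @name = name
    @count = count
  end
end

# Passes silently because both stand-in tables are empty.
ensure_pristine!([FakeModel.new("Role", 0), FakeModel.new("User", 0)])
puts "database is pristine, safe to seed"
```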

You should reseed every time you make changes to seeds.rb. If you follow a pull-request workflow when developing, reseeding is also useful when you switch between branches of your project that have differences in their seeds or schema.

In the next post I’ll discuss how seeds evolved throughout the life cycle of one of our projects.