SQL style guide
Originally crafted by Matt Chisholm. Has had only minor edits.
Identifiers
All identifiers
The database schema must be immediately comprehensible to any technical person who knows SQL, without requiring access to the application code, data, or any programmer or DBA.
- No acronyms or abbreviations
- Except when the acronym or abbreviation is long-ish and in such common use that it is an English word in its own right, such as
laser(acronym) orphone(abbreviation). Technical acronyms in common use around the office or in the field do not qualify for this exception. Neither doescmforcentimeters, because short acronyms are easily confused with other acronyms.
- Except when the acronym or abbreviation is long-ish and in such common use that it is an English word in its own right, such as
- No camelCase or StudlyCaps
- Even though SQL identifiers are case-insensitive, you still shouldn't. Lowercase is perfectly good.
- No wordsruntogether
- Do not use negated or negative terminology:
- use
send_claim=Trueinstead ofdo_not_send_claim=False - use
escape_output=Trueinstead ofdisable_output_escaping=False
- use
- Do not use the following substrings. They can be redundant and/or confusing:
- table
- row
- record
- column
- trigger
- key
- database
- db
- user (because it is a keyword in SQL)
- profile
- Do not use vague terms. They can be redundant and/or confusing. Examples include, but are not limited to:
value,length,status,store, &c. - Long identifiers are ok and almost always good. You only have to type them in a few places, most likely once in the ORM (Object-Relational Mapper) and a few more times in the controller. Optimize for comprehension, not keystrokes.
- Always separate different words with _ (underscore)
Column identifiers
- Do not include the data type in the identifier.
sent_booloraddress_stringare not allowed. Usesentoraddressinstead. If you are tempted to include the data type in the identifier, pick a better identifier. - Do not include the table name, or part of the table name, in column identifiers.
- Unique integer IDs on the table must always be called
id, nevertable_idorfoobar_id - Non-integer unique IDs can be named however you like (may change in the future).
- All foreign keys must be prefixed with the name of the table that they reference:
telephone_number_id, nevertelephone_numberorforeign_table_id - All foreign keys that point to unique integer ids must be suffixed with the
_idsuffix.
Index identifiers
- Indexes must be prefixed with name of the table.
- Indexes should list the column (or columns) that they index across.
- Indexes must be suffixed with
_index
Examples:
appointment_id_indexaddress_id_country_index
Trigger identifiers
- Trigger names should contain a verb denoting the action they are performing and the name(s) of the table(s) they are acting on.
- Triggers must be suffixed with
_beforeor_after, and with_insert,_update,_write(for triggers that are run at insert and update time), or_delete.
Examples:
find_duplicate_address_before_updatefix_appointment_date_after_insertremove_patient_dependents_before_deletecheck_telephone_after_write
Table design
Every table should have created and updated columns which are automatically set, and re-set at insert and update time, respectively, by the ORM. If you bitch about this now, you will be the subject of ridicule and abuse when it saves our ass fixing a hideously bizarre race condition bug.
Consider using a deleted boolean column. It would be set to false by the ORM initially, and ORM queries automatically don't return columns where it is set to true. Actually deleting columns can be problematic when there are triggers and foreign key constraints; using a deleted flag lets you hide items from the user immediately and garbage collect later. Or you can use it to un-do deletion.
Any columns which reference a column in another table (i.e. columns used in joins) should have a foreign key constraint.
Avoid multi-column indexes because they take up lots of space on disk.
Be intelligent
SQL supports far more primitive data types than your average programming language. Use the proper SQL data types to store the right kind of data. Anyone who stores dates as strings, ints as characters, floats as decimals, booleans as ints, and so on will be the subject of abuse and ridicule.
Don't stick stuff in the DB that doesn't need to be there. Intermediate data in the process of being imported or converted, constrained sets such as ISO country or language codes, and application configuration data such as file paths don't belong in the database.