Naming in Engineering Practice: The Stuff We Always Argue About

2021/10/05

Preface

Back when I was still in school and hadn’t started writing production code yet, I kept seeing seniors in the industry joking about this: coding takes five minutes, naming takes two hours. At the time I had no real project experience, so I just laughed it off. But once you’re actually doing engineering work, you really feel how true it is.

There was an interesting poll abroad that asked programmers to pick the hardest thing in day-to-day work. You might think system design or reading other people’s code would be the hardest, but in the end almost half the votes went to naming-related things.


In engineering practice, naming is probably one of the most common tasks—but doing it well is not easy. Think about it: how do you usually name things? Do you have a fixed naming habit, or do you improvise, look up a word that “seems about right,” and start writing logic? Today a method is called fetchXXX, tomorrow it’s findXXX, or even XXXFind. Because you “just make something up” every time, your naming style becomes a mess, and chances are you won’t even remember the method names you wrote—you’ll end up digging around for ages.

I strongly recommend spending half an hour summarizing and settling on your own naming style. Then you won't have to agonize for minutes every time you name something, and you won't worry that your names look amateurish and get laughed at by coworkers. Your coding speed goes up, and so does your code quality. I've summarized some naming habits that I personally use a lot, and I'll share them here. You can spend a bit of time referencing mine and then craft a style that fits your own preferences and your team's conventions.

Conventions

I won't repeat the common Java conventions too much here. Alibaba's P3C already provides pretty professional guidance, or you can refer to the Google Java Style Guide (GJSG).

  • Package names use all lowercase
  • Class names use UpperCamelCase (PascalCase)
  • Method names and normal variable names use lowerCamelCase
  • Constants (static final fields) and enum values use ALL_CAPS_WITH_UNDERSCORES (constants aside, Java naming conventions generally discourage underscores and similar characters in names)
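
As a quick illustration of the four rules above, here is a minimal sketch. The package, class, and member names (OrderStatus, OrderSummary, MAX_ITEM_COUNT, and so on) are made up for this example and don't come from any real project.

```java
// package com.example.naming;  // package name: all lowercase (left as a comment so the file compiles anywhere)

// Enum values: ALL_CAPS_WITH_UNDERSCORES
enum OrderStatus { PENDING_PAYMENT, PAID, CANCELLED }

// Class name: UpperCamelCase
final class OrderSummary {
    // Constant (static final): ALL_CAPS_WITH_UNDERSCORES
    static final int MAX_ITEM_COUNT = 50;

    // Normal variables: lowerCamelCase
    private final int itemCount;
    private final OrderStatus orderStatus;

    OrderSummary(int itemCount, OrderStatus orderStatus) {
        this.itemCount = itemCount;
        this.orderStatus = orderStatus;
    }

    // Method names: lowerCamelCase
    boolean exceedsItemLimit() {
        return itemCount > MAX_ITEM_COUNT;
    }

    OrderStatus getOrderStatus() {
        return orderStatus;
    }
}
```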

Whatever style you use, its main purpose is to distinguish responsibilities and keep the code clean. The most important quality of a name is that it is self-explanatory; the ideal is code that reads like its own comments.

URL path naming

To me, the importance of URL path naming is in long-term maintenance and intuitiveness. If you don’t name things well, as iterations go on—and compatibility layers and more projects pile up—it becomes hard to tell at a glance what a path is for, especially with RPC calls.

I find resource locator naming kind of weird: big companies are surprisingly inconsistent, and there isn’t a great universal standard. Personally, I advocate RESTful naming conventions within the team. Even though everyone may be writing RESTful APIs, some rules aren’t strictly followed, and people don’t always pay attention to this area. Let me share the format I personally prefer.

Format: http(s)://host/{app-name}/{version}/{domain}/{rest-convention}

{app-name}: Putting the application name in the path helps with cross-origin issues. For example, if the frontend and backend talk to each other across different subdomains, cross-origin problems come up easily, whereas a path prefix lets everything sit under one host. That said, from what I've seen, most consumer-facing product APIs just hardcode 'api'.

{version}: Represents the API version. This is what RESTful recommends, typically v1, v2, v3, etc.

{domain}: Used to define a technical/business area. In microservices, this is usually the module's business name.

{rest-convention}: The set of REST endpoints agreed upon under this domain.

Besides following the format, there are also some RESTful-recommended conventions:

  • URL paths should use all lowercase letters
  • When connecting words in a URL, you should use hyphens ( - ), not underscores ( _ )
  • URLs should only locate resources and should not contain verbs (actions should be reflected in the HTTP method)
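
To make the format and the rules above concrete, here is a rough sketch of a path builder and a checker. ApiPathConvention and its methods are hypothetical helpers written for this post, not part of any framework, and the regex only covers the lowercase and hyphen rules. It can't tell whether a segment is a verb; that still needs a human reviewer.

```java
import java.util.regex.Pattern;

final class ApiPathConvention {
    // /{app-name}/{version}/{domain}/{rest-convention...}
    // Every segment: lowercase letters, digits, and hyphens only; version is "v" plus digits.
    private static final Pattern CONVENTIONAL_PATH = Pattern.compile(
            "^/[a-z][a-z0-9-]*/v[0-9]+(/[a-z0-9][a-z0-9-]*)+$");

    // Returns true when the path follows the format and the character rules.
    static boolean isConventional(String path) {
        return CONVENTIONAL_PATH.matcher(path).matches();
    }

    // Assembles a path in the {app-name}/{version}/{domain}/{rest-convention} order.
    static String buildPath(String appName, int version, String domain, String... resourceSegments) {
        StringBuilder path = new StringBuilder("/" + appName + "/v" + version + "/" + domain);
        for (String segment : resourceSegments) {
            path.append('/').append(segment);
        }
        return path.toString();
    }
}
```

For example, buildPath("api", 1, "learning-groups", "42", "notes") gives /api/v1/learning-groups/42/notes, which passes the check, while a path containing learning_groups or a camelCase segment does not.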

Here’s an Ele.me API endpoint for checking “foodie beans.” The path naming is very standard—one glance and you can locate the business domain and quickly understand what the API does. But it uses lowerCamelCase in the path; personally I feel that when the path gets long, it becomes slightly less intuitive. That’s just personal preference. I’ve seen many Alibaba-family APIs that really like using uppercase letters in URLs.

GET /restapi/v1/users/supervip/pea/queryAccountBalance?types={}&longitude={}&latitude={}

General class naming

“General classes” are the business and utility classes everyone writes a lot. I think the real value of class naming is making things easy to find and maintain later. When a project grows, modifying requirements from a long time ago can take forever to locate. With good naming habits, you can find things instantly based on your own conventions. Also, if class names are a total mess, when you add new requirements and think about single responsibility, you won’t even know where to put the code—either you shove it somewhere randomly or create a new class. Over time you won’t remember anything, and you’ll spend ages searching again. Worst of all, for coworkers who have to take over your work, it’s a disaster. Here are some habits I’ve summarized.

  1. Enum classes end with "Enum" e.g. GroupAppTypeEnum
  2. Centralized constants interfaces end with "Constants" (defined as an interface) e.g. TopicConstants
  3. Abstract classes start with "Abstract" e.g. AbstractDistributedLock
  4. Custom exception classes end with "Exception" e.g. HeartbeatException
  5. Test classes end with "Test" e.g. StudyRoomInitTest
  6. Controllers end with "Controller" e.g. LearningGroupSubChannelController
  7. Business processing interfaces end with "Service" (not recommended to start with I), implementation classes end with "ServiceImpl" e.g. LearningFeedStoryStatusServiceImpl
  8. For persistence-layer wrappers… endings like "Dao", "Mapper", "Repository" are all fine—mainly depends on consistent project style and team habits. If I’m leading a new project, I’d name MongoDB/Redis/InfluxDB and other NoSQL or multi-storage aggregation accessors with the Repository suffix, and name MySQL/PostgreSQL and other RDBMS accessors with the Dao suffix e.g. AggHealthMongoRepository
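
Here is what a few of these habits look like side by side. This is only a skeleton: the class names are the examples from the list, while the members, enum values, and placeholder logic are invented for illustration.

```java
// 1. Enum class ends with "Enum" (values are hypothetical)
enum GroupAppTypeEnum { LEARNING, CHAT }

// 2. Centralized constants end with "Constants" and are defined as an interface
interface TopicConstants {
    String GROUP_UPGRADE_TOPIC = "group-upgrade"; // hypothetical value
}

// 3. Abstract class starts with "Abstract"
abstract class AbstractDistributedLock {
    abstract boolean tryLock(String lockKey);

    abstract void unlock(String lockKey);
}

// 4. Custom exception ends with "Exception"
class HeartbeatException extends RuntimeException {
    HeartbeatException(String message) {
        super(message);
    }
}

// 7. Service interface without an "I" prefix; implementation ends with "ServiceImpl"
interface LearningFeedStoryStatusService {
    String findStatus(long storyId);
}

class LearningFeedStoryStatusServiceImpl implements LearningFeedStoryStatusService {
    @Override
    public String findStatus(long storyId) {
        return "PUBLISHED"; // placeholder logic, only here so the skeleton compiles and runs
    }
}

// 8. Persistence accessor for a NoSQL store ends with "Repository"
interface AggHealthMongoRepository {
    long countByUserId(long userId); // hypothetical method
}
```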

I also wrote a simple template—this is basically what I directly apply when naming:

{domain}{subLabel}{businessType}{classType}

{domain}: your business domain

{subLabel}: sub-category

{businessType}: the business object being operated on

{classType}: the role—service, controller, etc.

For example, in a scenario I’m working on now: I have massive amounts of data to compute, and it needs to be real-time, so I added an intermediate layer—a dedicated service for aggregating data. There are several categories of aggregated data, and each category has different types. If I write a business implementation class, I might name it:

/**
agg table aggregation business, health indicates the health type of the aggregation business
HeartRate indicates the heart-rate business under health, ServiceImpl indicates the implementation class of business logic
**/
AggHealthHeartRateServiceImpl

Sometimes there isn’t an obvious hierarchy between business areas—then just use a single domain name. If the project is smaller, you can have one constants class or enum per module. The domain scope can be expanded or narrowed depending on the class role. Naming should be tightly coupled to the business and follow single responsibility.
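
If it helps, the template can be read as plain concatenation. The sketch below is a toy: ClassNameTemplate is a hypothetical helper I made up to show how the four parts combine, and nobody needs to generate names this way in real code.

```java
final class ClassNameTemplate {
    // Joins {domain}{subLabel}{businessType}{classType} into one UpperCamelCase name.
    // Null or empty parts are skipped, which covers the "single domain name" case.
    static String compose(String domain, String subLabel, String businessType, String classType) {
        StringBuilder name = new StringBuilder();
        for (String part : new String[] {domain, subLabel, businessType, classType}) {
            if (part == null || part.isEmpty()) {
                continue;
            }
            name.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
        }
        return name.toString();
    }
}
```

compose("agg", "health", "heartRate", "ServiceImpl") gives AggHealthHeartRateServiceImpl, and swapping only the last part gives AggHealthHeartRateController, so related classes sort and search together. With a single domain, compose("agg", "", "", "Constants") gives AggConstants.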

Entity class naming

Entity classes are where naming most often breaks camel case. Alibaba requires all-uppercase acronyms, but I don't want to follow that here: I think XxxVo looks better than XxxVO. By common practice you can write Xml, so why can't VO be Vo? Luckily my coworkers think the same way, so we happily write acronyms in camel case in our code.

[Image: Alibaba conventions]

Here are my personal habits:

  • Request body parameter objects end with Request e.g. UserLearningArchiveRequest
  • Response/view objects end with Vo e.g. LearningGroupNoteVo
  • Transport-layer objects end with Dto e.g. LearningGroupUpgradeDto
  • ES entity classes end with Do e.g. ChatGroupInfoDo
  • Mongo entity classes end with Doc e.g. LearningGroupInfoDoc
  • DB entity classes use UpperCamelCase of the table name (mainly to work nicely with MyBatis) e.g. TUser
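
To show how the suffixes tell you where an object lives, here is a small sketch of one request passing through the layers. The fields and the LearningGroupUpgradeConverter helper are assumptions for illustration; only the suffix habit comes from the list above.

```java
// Request body parameter object: ends with Request
final class LearningGroupUpgradeRequest {
    final long groupId;
    final int targetLevel;

    LearningGroupUpgradeRequest(long groupId, int targetLevel) {
        this.groupId = groupId;
        this.targetLevel = targetLevel;
    }
}

// Transport-layer object passed between layers: ends with Dto
final class LearningGroupUpgradeDto {
    final long groupId;
    final int targetLevel;
    final long operatorId;

    LearningGroupUpgradeDto(long groupId, int targetLevel, long operatorId) {
        this.groupId = groupId;
        this.targetLevel = targetLevel;
        this.operatorId = operatorId;
    }
}

// Response/view object returned to the caller: ends with Vo
final class LearningGroupUpgradeVo {
    final long groupId;
    final int currentLevel;

    LearningGroupUpgradeVo(long groupId, int currentLevel) {
        this.groupId = groupId;
        this.currentLevel = currentLevel;
    }
}

// Hypothetical converter: the suffix alone shows which layer each object belongs to
final class LearningGroupUpgradeConverter {
    static LearningGroupUpgradeDto toDto(LearningGroupUpgradeRequest request, long operatorId) {
        return new LearningGroupUpgradeDto(request.groupId, request.targetLevel, operatorId);
    }

    static LearningGroupUpgradeVo toVo(LearningGroupUpgradeDto dto) {
        return new LearningGroupUpgradeVo(dto.groupId, dto.targetLevel);
    }
}
```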

Overall, just follow the business. If it’s complex and easy to mix up, you can add a domain prefix to scope it—but I think you should try to include a domain as much as possible. Longer names are fine; what you want to avoid is ambiguity and future trouble. For example, for a group upgrade DTO, at first you name it UpgradeDto; later you have team upgrades and group upgrades, so you change it to GroupUpgradeDto; later you have learningGroup and chatGroup upgrades, so you change it again to LearningGroupUpgradeDto.

The funniest (and most realistic) situation is often like this: you start with UpgradeDto for group upgrades; later you add team upgrades and want to use UpgradeDto, but it already exists. You stare at it for a while trying to remember what it is, then realize it's for group upgrades, and to avoid conflict you create a new TeamGroupUpgradeDto and move on. Later during maintenance, you see UpgradeDto and TeamGroupUpgradeDto. They're actually parallel concepts, but you can't tell from the names. This confuses maintainers and can even make them afraid to touch the code.

Variable naming

In a project, bad variable naming is a nightmare for future maintainers. Even worse, you yourself can forget. Classes, methods, and comments can all help understanding, but variables rarely have comments, and I don't think anyone enjoys reading code full of unexplained names. I once discussed with a former coworker whether code is for humans or machines. My view has always been clear: code is for humans. Machines don't understand your source code anyway, and you don't understand the compiled or interpreted code that actually runs either.

I really dislike the i/j/k naming style from some C-like languages. Even in demos and coding practice, I refuse to use single-letter names and try to make names meaningful. I've seen many Go veterans still stick to single-letter naming, "save what you can," I guess. I don't get it.

A common pattern for variable names is using nouns (phrases) or adjectives: nouns for normal variables, adjectives for booleans.

Here are a few tips to help you write good variable names:

  • Keep naming consistent across the project

Within the same project, try to keep variable names consistent for the same meaning. For example, both sum and total can represent a total, but once you use sum, don’t switch to total in the same scenario.

  • Keep plural style consistent

If you have a List<Student>, how would you name it? Both students and studentList work. I prefer the latter, because later when you use it, you can tell from the name what the container is (studentList, studentSet, etc.) without hovering with the mouse. Of course this is personal preference; the key is consistency. Don't have both students and studentList at the same time, or you'll confuse others (and your future self).
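
A tiny sketch of the two tips so far, using made-up names (StudentRoster, countDistinctStudents): the container type is carried in the variable name, and the word chosen for "total" stays the same throughout.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StudentRoster {
    // The List/Set suffix tells the reader which container is in play without hovering.
    static int countDistinctStudents(List<String> studentNameList) {
        Set<String> studentNameSet = new HashSet<>(studentNameList);
        // "total" is the word picked for totals here; don't switch to "sum" elsewhere in the project.
        int studentTotal = studentNameSet.size();
        return studentTotal;
    }
}
```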

  • Don’t start booleans with is

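This is mentioned in Alibaba's convention manual. I've used is-prefixed names before, but it's better to follow widely accepted conventions, or one day you might step on a landmine: some frameworks derive the property name from the getter, so a field named isAlive can end up being treated as a property named alive and cause serialization errors. So how do you name booleans? I personally like replacing is with bool, e.g. isAlive becomes boolAlive.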

[Image: explanation in Alibaba's convention manual]
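
The mismatch can be seen with the JDK's own JavaBeans introspection. This is a minimal sketch: BooleanNamingDemo and the two status classes are invented for the demo, and it only shows how java.beans.Introspector derives property names. How a particular serialization framework reacts to the mismatch depends on that framework.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

final class BooleanNamingDemo {
    // Field named isAlive; the usual generated accessors are isAlive() and setAlive()
    public static class PlayerStatus {
        private boolean isAlive = true;
        public boolean isAlive() { return isAlive; }
        public void setAlive(boolean alive) { this.isAlive = alive; }
    }

    // Field named boolAlive; accessors are isBoolAlive() and setBoolAlive()
    public static class RenamedPlayerStatus {
        private boolean boolAlive = true;
        public boolean isBoolAlive() { return boolAlive; }
        public void setBoolAlive(boolean boolAlive) { this.boolAlive = boolAlive; }
    }

    // Lists the bean property names that the JDK derives from the accessor methods.
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor descriptor
                    : Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For PlayerStatus the derived property is alive, which no longer matches the field isAlive. For RenamedPlayerStatus the derived property is boolAlive, the same as the field.
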

  • Don't fear long names; avoid abbreviations

This was something I struggled with early on: full names are long, abbreviations aren’t clear. My approach now is: only abbreviate when it’s a widely accepted abbreviation; otherwise, write it out. For example, if you write cnt, in my head I’ll see content, context, count, contrast flying by—way too ambiguous. Here’s a table of common abbreviations:

Full name       Abbrev
identification  id
average         avg
maximum         max
minimum         min
error           err
message         msg
image           img
length          len
library         lib
password        pwd

Method naming

When writing Java business logic, most of what you do is calling methods—your own or someone else’s. If you debug and glance at the call stack, there can easily be hundreds of methods, so method names are crucial. If the name is straightforward, readers don’t need to read all the code—they can just scan the main flow and quickly get what the method is doing.

Method naming is easy and hard at the same time. Easy because there’s a fixed pattern: usually a verb + noun combination. Hard because the verb needs to be precise and the name needs to be professional.

{verb}{noun}, e.g. fetchUserList()

Here are a few tips to help you write good method names:

  • Verbs must be precise

For example, a method called addCharacter() looks fine at first glance: it “adds a character” to a string. But does it add to the beginning, or append to the end? You can’t tell the real intent from the name—you have to read the implementation. But if I call it appendCharacter(), doesn’t it immediately feel more precise?
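
Written out, the difference looks like this. CharacterEditor and prependCharacter are names I made up for the sketch; the point is only that each verb leaves exactly one possible behavior.

```java
final class CharacterEditor {
    // "append" can only mean: put the character at the end
    static String appendCharacter(String text, char character) {
        return text + character;
    }

    // "prepend" can only mean: put the character at the beginning
    static String prependCharacter(String text, char character) {
        return character + text;
    }
}
```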

Pick good verbs and your method names become much easier to understand. Here’s a verb table:

Category                             Words
Add/Insert/Create/Init/Load          add, append, insert, create, initialize, load
Delete/Destroy                       delete, remove, destroy, drop
Open/Begin/Start                     open, start
Close/Stop                           close, stop
Get/Read/Find/Query                  get, fetch, acquire, read, search, find, query
Set/Reset/Put/Write/Release/Refresh  set, reset, put, write, release, refresh
Send/Push                            send, push
Receive/Pull                         receive, pull
Submit/Revoke/Cancel                 submit, cancel
Collect/Pick/Select/Choose           collect, pick, select
Extract/Parse                        sub, extract, parse
Encode/Decode                        encode, decode
Fill/Pack/Compress                   fill, pack, compress
Flush/Clear/Unpack/Decompress        flush, clear, unpack, decompress
Increase/Decrease                    increase, decrease, reduce
Split/Join/Concat                    split, join, concat
Filter/Validate/Check                filter, validate, check

  • Nouns must be professional

The verb defines the action, and the noun defines the object being operated on. For nouns, try to use domain vocabulary. There aren’t that many commonly used words anyway—don’t go to a translator to find obscure words; it’s really unnecessary. Also try to follow common habits: for example, collection length is usually size, arrays and strings usually use length. Don’t reinvent the wheel by using size to represent string length. Here’s a noun table:

Category                                        Words
Capacity/Size/Length                            capacity, size, length
Instance/Context                                instance, context
Configuration                                   config, settings
Header/Front/Previous/First                     header, front, previous, first
Tail/Back/Next/Last                             tail, back, next, last
Range/Interval/Region/Area/Section/Scope/Scale  range, interval, region, area, section, scope, scale
Cache/Buffer/Session                            cache, buffer, session
Local/Global                                    local, global
Member/Element                                  member, element
Menu/List                                       menu, list
Source/Destination/Target                       source, destination, target

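
Here is a small sketch that leans on nouns from the table, following the JDK habit that collections expose size() while strings and arrays use length. RecentMessageBuffer is a hypothetical class made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Keeps only the most recent messages, up to a fixed capacity.
final class RecentMessageBuffer {
    private final int capacity;                       // maximum number of messages kept
    private final Deque<String> messageQueue = new ArrayDeque<>();

    RecentMessageBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds to the tail; drops the oldest message once the buffer is full.
    void append(String message) {
        if (messageQueue.size() == capacity) {
            messageQueue.removeFirst();
        }
        messageQueue.addLast(message);
    }

    int size() { return messageQueue.size(); }        // element count, like Collection.size()
    int capacity() { return capacity; }               // upper bound, not the current count
    String first() { return messageQueue.peekFirst(); }
    String last() { return messageQueue.peekLast(); }
}
```

A reader who knows the JDK already knows what size(), capacity(), first(), and last() mean here without opening the class.
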
  • Single responsibility

After Java 8 added lambdas, it started leaning a bit toward FP (functional programming). Now when operating on collections, using Stream operators can be super smooth—you just remember the operator names and don’t need to dive into implementations to know what they do. Java also encourages this style: encapsulate specific operations into methods, give them good names, and you’re done. Traditionally we extracted methods only when we had shared code, but now any standalone logic can be extracted. Then your main flow only contains method calls, and readers can quickly understand what you’re doing.

The most important part is how you split methods, and not doing logic that doesn’t match the method name. Don’t cram “do this and also do that” into one method, and don’t keep stuffing more and more into it during maintenance. That’s a nightmare for whoever takes over.
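
As a sketch of what that looks like, here is a main flow built only from well-named steps. StudentReport, its nested Student class, and the thresholds (18 and 90) are assumptions for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

final class StudentReport {
    static final class Student {
        final String name;
        final int age;
        final int score;

        Student(String name, int age, int score) {
            this.name = name;
            this.age = age;
            this.score = score;
        }
    }

    // The main flow is just a sequence of named steps; each helper does exactly one thing.
    static List<String> listAdultHonorStudentNames(List<Student> studentList) {
        return studentList.stream()
                .filter(StudentReport::isAdult)
                .filter(StudentReport::hasHonorScore)
                .map(StudentReport::formatDisplayName)
                .collect(Collectors.toList());
    }

    private static boolean isAdult(Student student) {
        return student.age >= 18; // hypothetical threshold
    }

    private static boolean hasHonorScore(Student student) {
        return student.score >= 90; // hypothetical threshold
    }

    private static String formatDisplayName(Student student) {
        return student.name.toUpperCase();
    }
}
```

If a new rule shows up later, it becomes another named step in the pipeline instead of extra lines stuffed into an existing method.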

  • Make good use of DTOs

For example, when writing CRUD code: for a "query students" API, the product initially wants to query by age, so you write getStudentByAge; later they want to query by age and city, so you write getStudentByAgeAndCity; then they want age + city + gender, so you write getS... Stop. If you name things like this, your nightmare begins.

A better approach in my opinion: if you query by primary key, you can use By to connect the query. But if you query by other attributes and there may be many combinations later, then you should encapsulate it: getStudents(StudentSearchDto searchParam).
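
A minimal sketch of that idea, assuming an in-memory list in place of a real database. The fields of Student and StudentSearchDto, and the withAge/withCity helpers, are made up for illustration; only getStudents(StudentSearchDto searchParam) comes from the text above.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class Student {
    final long id;
    final int age;
    final String city;

    Student(long id, int age, String city) {
        this.id = id;
        this.age = age;
        this.city = city;
    }
}

// Search criteria; a null field means "don't filter on this attribute".
final class StudentSearchDto {
    Integer age;
    String city;

    StudentSearchDto withAge(int age) { this.age = age; return this; }
    StudentSearchDto withCity(String city) { this.city = city; return this; }
}

final class StudentService {
    private final List<Student> studentList;

    StudentService(List<Student> studentList) {
        this.studentList = studentList;
    }

    // Primary-key lookup: connecting with "By" is fine.
    Optional<Student> getStudentById(long id) {
        return studentList.stream().filter(student -> student.id == id).findFirst();
    }

    // Any combination of attributes goes through one method; new criteria extend the DTO.
    List<Student> getStudents(StudentSearchDto searchParam) {
        return studentList.stream()
                .filter(student -> searchParam.age == null || searchParam.age == student.age)
                .filter(student -> searchParam.city == null || searchParam.city.equals(student.city))
                .collect(Collectors.toList());
    }
}
```

When the product asks for gender next, you add one field to StudentSearchDto and one filter, and the method name doesn't change.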

Finally

Finally, I suggest that when you’re writing code, don’t be afraid to spend time on naming, and don’t be afraid to spend time summarizing. Good naming can save you a lot of refactoring later—so promise me you’ll make names “self-explanatory,” okay?

All articles in this blog, unless otherwise stated, are licensed under @Oreoft . Please indicate the source when reprinting!
