libraries considered harmful
A library is shared code that you cannot otherwise name. It is a confused amalgamation of different parts of code, different usages and different business rules. It serves more than a single purpose, and thus has no well-defined, clear purpose. It is hard to update, hard to work with, and hard to use.
Many of us get a nasty feeling when some shared code ends up in a package or class called Util. Why is this?
Util classes are not themselves evil or bad, but they encourage misuse and contravention of good design principles. They very often go against the S in SOLID: single responsibility. The broadly scoped name Util makes them a dumping ground for code that has no other logical place, even when better places could, and should, exist.
Libraries are much the same: they have a broadly scoped name. But they are arguably worse. They share and distribute your code all over your systems, and often make themselves very hard to unwind. The lack of a clear name makes them feel like a safe and logical place to add additional code (code that should not leak out of a certain deployable or application).
It’s this additional code, often business logic, that makes them hard to work with, hard to use, and hard to update. They are too specialised and too generalised at the same time. They achieve both too much and too little.
what should we use instead?
Better names. Naming can be the hardest part of software engineering; it requires creativity, and a name is the first point of communication of what a snippet, fragment, class, package etc. does. Anyone who can tell me what any given Util class does (beyond “holds utility functions”) clearly wrote the class; otherwise they would not know. It is a nothing name, a series of letters that make up a familiar-sounding word that conveys no meaning. Or arguably less than no meaning: it’s a misleading name.
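As a small illustration of the naming point, here is a hypothetical sketch (none of these class or function names come from a real codebase) that places the same two helpers first in a catch-all Util and then under names that say what they are for:

```python
import re
from datetime import date


class Util:
    """A nothing name: the reader learns nothing about what lives here."""

    @staticmethod
    def slugify(title: str) -> str:
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

    @staticmethod
    def days_between(start: date, end: date) -> int:
        return (end - start).days


class UrlSlug:
    """Turns human-readable titles into URL-safe slugs. Nothing else."""

    @staticmethod
    def from_title(title: str) -> str:
        # Lowercase, collapse every run of non-alphanumerics into one hyphen,
        # then trim hyphens from both ends.
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


class DateRange:
    """A span between two calendar dates. Nothing else."""

    def __init__(self, start: date, end: date) -> None:
        self.start = start
        self.end = end

    def days(self) -> int:
        # Number of whole days from start to end (negative if end < start).
        return (self.end - self.start).days
```

The behaviour is identical in both versions. The difference is that nobody will be tempted to add a currency formatter to UrlSlug.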
Now that we all agree that naming is hard, and important, what of libraries?
Libraries should not be called libraries. What even are libraries?
Clearly the definitions regarding books, buildings and houses are not immediately relevant. Perhaps definition 5 gives us more insight:
“a collection of programs and software packages made generally available”
That is about as descriptive as Util classes!
redefining the names
If we are building and therefore naming shared code packages and repositories then we should be building one of the following:
- Application
- Framework
- Toolkit
If it doesn’t fit into one of those categories then you are building the wrong thing. Let’s look at each one:
application
Application is a fairly obvious one: this is probably your deployable, or the thing that you are shipping. Does the definition help us?
Yes, even without the contextual usage about database applications we can understand more about this package than just library! We also understand that this is probably the place that our business-related code goes; it’s the logical place for application code to live. It fulfils its purpose and doesn’t try to do or be anything else.
framework
Frameworks are supporting structures; they are wrappers.
This is supporting code. It is not specific to any given application, only to the supporting task that it’s trying to achieve. Think of web frameworks that run servers: they wrap applications and application code and allow you to run it as a web app. What they don’t do is try to dictate the logic of the code you write; at most they dictate the code structure and its external interactions. If you are writing frameworks, ensure they are general enough to fulfil many more use cases than you can imagine at the time of writing.
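To make the split concrete, here is a deliberately tiny, hypothetical sketch. MiniFramework and its methods are invented for illustration and do not mirror any real web framework's API. The framework dictates structure (handlers are registered against a path and receive a request dict) and nothing about what the handlers decide:

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


class MiniFramework:
    """Framework code: routing and dispatch only, no business rules."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def route(self, path: str) -> Callable[[Handler], Handler]:
        # Decorator that registers a handler for a path and returns it unchanged.
        def register(handler: Handler) -> Handler:
            self._routes[path] = handler
            return handler

        return register

    def handle(self, path: str, request: dict) -> str:
        # Dispatch to the registered handler, or report that none exists.
        handler = self._routes.get(path)
        if handler is None:
            return "404 Not Found"
        return handler(request)


# Application code: this is where the logic lives, outside the framework.
app = MiniFramework()


@app.route("/greet")
def greet(request: dict) -> str:
    return f"Hello, {request.get('name', 'world')}!"
```

Calling `app.handle("/greet", {"name": "Ada"})` runs the application's handler, and the framework never needs to know what a greeting is.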
toolkit
Toolkits are the most abused. These packages are the closest to Util classes. These are the places you are most likely to dump code that should not be shared. One of the Go proverbs is:
“A little copying is better than a little dependency.”
The definition:
The non-computing definition here is better, the key being “for a particular purpose”. That purpose is not, should not and will not be a business logic purpose (as that is application code!). That purpose is something more general and reusable than that. A good example of a toolkit is the Stanford NLP toolkit. It provides NLP tools: parsers, tokenisers, POS taggers etc. These are tools in a kit that fulfil a purpose, and that purpose is processing natural language. They do not try to dictate the applications or use cases of the tools; they only provide tools that enable those use cases.
If you think you are writing a toolkit, ask whether it could be taken out of your business and used again elsewhere. If not, you are not writing a toolkit! Take a well-known company such as Uber: part of their system creates geofences. A geofence is a human-defined geographic area (or polygon, in geometry terms) on the Earth’s surface. The code that defines geofences and interacts with them could be extracted into a toolkit. Code that decides which two points are closest together within a region could be in that toolkit, but not code about riders or drivers! Code that interacts with the geolocations could be in that toolkit, but not code that prioritises routes or makes Uber business decisions!
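A rough sketch of where that line might fall is below. Every name in it is hypothetical and none of it is Uber's actual code. It also assumes a geofence can be treated as a polygon on a flat plane, which is a simplification of real geography. The Geofence class knows only geometry; the rule about riders lives in application code that merely uses it:

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y); assumed planar, e.g. (longitude, latitude)


class Geofence:
    """Toolkit code: pure geometry, reusable by any business."""

    def __init__(self, vertices: List[Point]) -> None:
        if len(vertices) < 3:
            raise ValueError("a geofence needs at least three vertices")
        self.vertices = list(vertices)

    def contains(self, point: Point) -> bool:
        # Ray casting: count how many polygon edges a horizontal ray
        # from the point crosses; an odd count means the point is inside.
        x, y = point
        inside = False
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            if (y1 > y) != (y2 > y):  # edge straddles the ray's height
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


# Application code: the business rule stays out of the toolkit.
def can_request_pickup(rider_location: Point, service_area: Geofence) -> bool:
    return service_area.contains(rider_location)
```

If `can_request_pickup` were moved into the Geofence class, the toolkit would stop being reusable by anyone who does not have riders.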
As soon as business logic goes into a toolkit you are writing a library, you are writing a Util package, and you are creating a shared code nightmare for yourself!