Posted by Sam on Jul 02, 2009 at 07:30 AM UTC - 5 hrs
This is the "my lawn needs more water and my wife disagrees"
edition of Programming Quotables.
If you don't know - I don't like to have too many microposts on this blog
(I'm on twitter for that), so I save them up as I run across them,
and every once in a while I'll post a few of them. The idea is to post quotes about programming that have
one or more of the following attributes:
I find funny
I find asinine
I find insightfully true
And stand on their own, with little to no comment needed
It's up to you decide which category they fall in, if you care to. Anyway, here we go:
More...
"Is it still Scrum if we use a kanban board instead of a burndown?"
"Is it still XP if we don't break things down into tasks but just use stories?"
If we hope to excite people with the power of programming, we will do it with big ideas, not the placement of periods, spaces, keywords, and braces. We need to find ways so that students can solve problems and write programs by understanding the ideas behind them, using tools that get in the way as little as possible. No junk allowed. That may be through simpler languages, better libraries, or something else that I haven't learned about yet.
It's entirely possible that there's a detente to be reached between the copyists and the copyright holders: a set of rules that only try to encompass "culture" and not "industry." But the only way to bring copyists to the table is to stop insisting that all unauthorized copying is theft and a crime and wrong. People who know that copying is simple, good, and beneficial hear that and assume that you're either talking nonsense or that you're talking about someone else.
Hey! Why don't you make your life easier and subscribe to the full post
or short blurb RSS feed? I'm so confident you'll love my smelly pasta plate
wisdom that I'm offering a no-strings-attached, lifetime money back guarantee!
Posted by Sam on Jun 24, 2009 at 11:39 PM UTC - 5 hrs
Many programmers view piracy as some inevitable righteous result of the coming of the information age.
We justify the theft of music (in particular in this case) in several ways:
Artists benefit because the number of fans increase which sells more tickets and merchandise at shows
For the record, I agree with #3, used to think #1 was the case, and I'm refusing to do business with anyone
I know is a member of ASSHAT ASCAP. I'm very consciously changing my view on #1
after hearing the artists' point of view.
So the world will always have its Billy Joel's, along with its hot shot Internet sensations of the moment.
But what I want to know, again, is who's going to be paying for the recording of all this free music
dreamed up by small-timers (which description, if you're not aware, covers most of us)?
And if the answer is "nobody," that means that music made by people who are even a little bit
outside the mainstream, off the beaten path, or just plain fucking weird, is going to
disappear (save, of course, for those with disposable cash - the hobbyists and the rich kids -
but those people tend to make music that people can't even be bothered to download for free).
Long term, this isn't going to work out well for music fans. I'm not scolding. I'm not trying to stuff
the genie back in the bottle. I just want the emperor's distinct lack of clothing noted: the alleged death
of the music industry is in actuality the death of interesting music.
I said the same thing six years ago when I first wrote about this stuff.
And what's happened since? Music has become more boring than ever, that's what.
Even punk rock has become more mind-numbingly mediocre than I ever remember it being before,
and I came of age when bands like Youth Of Today and Corrosion of Conformity were hot shit
with the dimbulb set.
I haven't seen so many bands playing it safe and copying what's popular since the skulls and
anarchy-symbol craze of 1984. If I hear another band aping the Dillinger Four and shouting
their Cookie Monster vocals at me through their hobo beards I may begin to sob.
At the time I was first bemoaning the effect file-sharing was bound to have on the best music,
I was informed (rather condescendingly, as I recall) that all those fans who were illegally
downloading my music were going to be paying me back in spades via sales of show tickets and merchandise.
And while my showing in that department is dandy at the moment, it certainly hasn't picked up any in
the ensuing six years. In fact, merchandise sales are almost exactly the same. Granted, these days
that's cause to break out the bubbly and perch a lampshade on the old noggin, but I can't help but
notice that this whole file sharing thing hasn't quite worked out as advertised - not just for me,
but for any other working musicians I know either.
Now maybe that's all a big, fat coincidence,
but it's kind of hard to escape the conclusion that what this is, was, and always will be about
is people getting something for nothing. It's not about crazed rock fandom.
It's about as gobbling
up as much free stuff as you can with little regard for what it is or might be, and virtually no
patience at all when it comes to evaluating the goods because, after all, if it's free, what can
it really be worth? And the reason people are doing it isn't because any sort of revolution has occurred;
it's not a consequence of us poor artists having been unshackled from the chains of the evil
record labels and their PR teams and A&R men and distributors and lawyers and accountants.
It's happening because people can steal music easily and without any real risk of getting caught.
That's the way the music business is these days and while I'm not happy about it I'm well aware
there's no point in fighting it. I refuse to allow people to dress it up as something noble when
it's nothing more than simple greed and theft, but don't get the idea that I'm raging against the
dying of the light; I assure you that at this stage the cupboard has been rendered so bare that
record royalties are more or less a moot point.
I'd be happy to give away future albums, settling
for crossing my fingers and praying for the odd licensing deal, if only I could figure out how to
pay for the damned recordings!
(bolding and paragraph additions (for readability) by yours truly)
Posted by Sam on Jun 18, 2009 at 11:33 AM UTC - 5 hrs
Don't encode information into a string like "AAHD09102008BSHC813" and give that gibberish to people. Don't name your project that, don't give that to me as a value or way to identify something, and don't make humans see or interact with that in any form. (If you are generating something similar and parse it with a program in automated fashion, I don't care what you call it.)
Give it a name we can use while communicating with each other and keep the rest of the information in a database. I can look it up if I need to know it.
Do not use file names, folder names, or project names as your as your database. I don't want to be required to scan each item in whatever set you chose and translate it using a lookup table to find what I'm looking for. I don't want to memorize the lookup table either.
Posted by Sam on Jun 17, 2009 at 12:18 PM UTC - 5 hrs
low cou-pling and high co-he-sion n.
A standard bit of advice for people who are learning to design their code better, who want to
write software with intention as opposed to coincidence, often parroted by the advisor
with no attempt to explain the meaning.
Motivation
It's a great scam, don't you think? Someone asks a question about how to design their code,
and we have these two nebulous words to throw back at them: coupling and cohesion.
We even memorize a couple of adjectives that go with the words: low and high.
More...
Cohesion Good. Coupling, Baaaaad!
It's great because it shuts up the newbie who asks the question -- he doesn't want to appear dumb, after all --
and it gets all of those in-the-know to nod their heads in approval. "Yep, that's right. He's got it. +1."
But no one benefits from the exchange. The newbie is still frustrated, while the professional doesn't
give a second thought to the fact that he probably doesn't know what he means. He's just parroting
back the advice that someone gave to him. It's not malicious or even conscious, but nobody is getting smarter
as a result of the practice.
Maybe we think the words are intuitive enough. Coupling means that something is depending on something else, multiple
things are tied together. Cohesion means ... well, maybe the person asking the question heard something about
it in high school chemistry and can recall it has something to do with sticking together.
Maybe they don't know at all.
Maybe, if they're motivated enough (and not that we've done anything to help in that department), they'll look it
up:
Coincidental cohesion (worst)
is when parts of a module are grouped arbitrarily (at random); the parts have no significant relationship (e.g. a module of frequently used functions).
Logical cohesion
is when parts of a module are grouped because they logically are categorised to do the same thing, even if they are different by nature (e.g. grouping all I/O handling routines).
Temporal cohesion
is when parts of a module are grouped by when they are processed - the parts are processed at a particular time in program execution (e.g. a function which is called after catching an exception which closes open files, creates an error log, and notifies the user).
Procedural cohesion
is when parts of a module are grouped because they always follow a certain sequence of execution (e.g. a function which checks file permissions and then opens the file).
Communicational cohesion
is when parts of a module are grouped because they operate on the same data (e.g. a module which operates on the same record of information).
Sequential cohesion
is when parts of a module are grouped because the output from one part is the input to another part like an assembly line (e.g. a function which reads data from a file and processes the data).
Functional cohesion (best)
is when parts of a module are grouped because they all contribute to a single well-defined task of the module
Content coupling (high)
is when one module modifies or relies on the internal workings of another module (e.g. accessing local data of another module).
Therefore changing the way the second module produces data (location, type, timing) will lead to changing the dependent module.
Common coupling
is when two modules share the same global data (e.g. a global variable).
Changing the shared resource implies changing all the modules using it.
External coupling
occurs when two modules share an externally imposed data format, communication protocol, or device interface.
Control coupling
is one module controlling the logic of another, by passing it information on what to do (e.g. passing a what-to-do flag).
Stamp coupling (Data-structured coupling)
is when modules share a composite data structure and use only a part of it, possibly a different part (e.g. passing a whole record to a function which only needs one field of it).
This may lead to changing the way a module reads a record because a field, which the module doesn't need, has been modified.
Data coupling
is when modules share data through, for example, parameters. Each datum is an elementary piece, and these are the only data which are shared (e.g. passing an integer to a function which computes a square root).
Message coupling (low)
is the loosest type of coupling. Modules are not dependent on each other, instead they use a public interface to exchange parameter-less messages (or events, see Message passing).
No coupling
[is when] modules do not communicate at all with one another.
What does it all mean?
The Wikipedia entries mention that "low coupling often correlates with high cohesion" and
"high cohesion often correlates with loose coupling, and vice versa."
However, that's not the intuitive result of simple evaluation, especially on the part of someone who doesn't
know in the first place.
In the context of the prototypical question
about how to improve the structure of code, one does not lead to the other. By reducing coupling, on the face of
it the programmer is going to merge unrelated units of code, which would also reduce cohesion. Likewise, removing
unrelated functions from a class will introduce another class on which the original will need to depend, increasing
coupling.
To understand how the relationships become inversely correlated requires a larger step in logic, where examples
of the different types of coupling and cohesion would prove helpful.
Examples from each category of cohesion
Coincidental cohesion often looks like this:
class Helpers;
class Util;
int main(void) {
where almost all of your code goes here;
return 0;
}
In other words, the code is organized with no special thought as to how it should be organized.
General helper and utility classes,
God Objects,
Big Balls of Mud, and other anti-patterns
are epitomes of coincidental cohesion.
You might think of it as the lack of cohesion: we normally talk about cohesion being a good thing, whereas
we'd like to avoid this type as much as possible.
(However, one interesting property of coincidental cohesion is that even though the code in question should not be stuck together,
it tends to remain in that state because programmers are too afraid to touch it.)
With logical cohesion, you start to have a bit of organization. The Wikipedia example mentions "grouping
all I/O handling routines." You might think, "what's wrong with that? It makes perfect sense." Then consider that
you may have one file:
IO.somelang
function diskIO();
function screenIO();
function gameControllerIO();
While logical cohesion is much better than coincidental cohesion, it doesn't necessarily go far enough in terms
of organizing your code. For one, we've got all IO in the same folder in the same file, no matter what type of
device is doing the inputting and outputting. On another level, we've got functions that handle both input and
output, when separating them out would make for better design.
Temporal cohesion
is one where you might be thinking "duh, of course code that's executed based on some other
event is tied to that event." Especially considering the Wikipedia example:
a function which is called after catching an exception which closes open files,
creates an error log, and notifies the user.
But consider we're not talking about simple the relationship in time. We're really interested in the code's structure.
So to be temporally cohesive, your code in that error handling situation should keep the closeFile,
logError, and notifyUser functions close to where they are used. That doesn't mean
you'll always do the lowest-level implementation in the same file -- you can create small functions that take
care of setting up the boilerplate needed to call the real ones.
It's also important to note that you'll almost never want to implement all of that directly in the catch
block. That's sloppy, and the antithesis of good design. (I say "almost" because I am wary of absolutes, yet I cannot think
of a situation where I would do so.) Doing so violates functional cohesion, which is what we're really
striving for.
Procedural cohesion
is similar to temporal cohesion, but instead of time-based it's sequence-based. These are similar because
many things we do close together in time are also done in sequence, but that's not always the case.
There's not much to say here. You want to keep the definitions of functions that do things together structurally
close together in your code, assuming they have a reason to be close to begin with. For instance,
you wouldn't put two modules of code together if they're not at least logically cohesive to begin with. Ideally,
as in every other type of cohesion, you'll strive for functional cohesion first.
Communicational cohesion
typically looks like this:
some lines of code;
data = new Data();
function1(Data d) {...};
function2(Data d) {...};
some more lines of code;
In other words, you're keeping functions together that work on the same data.
Sequential cohesion
is much like procedural and temporal cohesion, except the reasoning behind it is that functions would
chain together where the output of one feeds the input of another.
Functional cohesion is the ultimate goal.
It's The Single Responsibility Principle [PDF] in
action. Your methods are short and to the point. Ones that are related are grouped together locally in a file.
Even files or classes contribute to one purpose and do it well. Using the IO example from above, you might have
a directory structure for each device, and within it, a class for Input and one for Output. Those would be children
of abstract I/O classes that implemented all but the device-specific pieces of code.
(I use inheritance terminology here only
as a subject that I believe communicates the idea to others. Of course, you don't have to even have inheritance
available to you to achieve the goal of keeping device agnostic code in one locale while keeping the device
specific code apart from it).
Examples from each category of coupling
Content coupling is horrific. You see it all over the place. It's probably in a lot of your code, and
you don't realize it. It's often referred to a violation of encapsulation in OO-speak, and it looks like one
piece of code reaching into another, without regard to any specified interfaces or respecting privacy. The problem
with it is that when you rely on an internal implementation as opposed to an explicit interface, any time that
module you rely on changes, you have to change too:
module A
data_member = 10
end
module B
10 * A->data_member
end
What if data_member was really called num_times_accessed? Well, now you're screwed since you're
not calculating it.
Common coupling
occurs all the time too. The Wikipedia article mentions global variables, but this could be just a member in a class
where two or more functions rely on it if you consider it. It's not as bad when its encapsulated behind an interface,
where instead of accessing the resource directly, you do so indirectly, which allows you to change internal
behavior behind the wall, and keeps your other units of code from having to change every time the shared resource
changes.
An example of external coupling is a program where one part of the code reads a specific file format that
another part of the code wrote. Both pieces need to know the format so when one changes, the other must as well.
unit A
write_csv_format();
end
unit B // in another file, probably
read_csv_format();
end
Control coupling
might look like:
// unit A
function do(what){
if(what == 1) do_wop;
else if (what == 2) ba_ba_da_da_da_do_wop;
}
// unit B
A.do(1);
Stamp coupling (Data-structured coupling)
involved disparate pieces of code touching the same data structure in different ways. For example:
employee = { :age => 24, :compensation=> 2000 }
def age_range(employee)
range = 1 if employee[:age] < 10
range = 2 if employee[:age] > 10 && < 20;
...
return range
end
def compensation_range(employee)
... only relies on employee[:compensation] ...
end
The two functions don't need the employee structure, but they rely on it and if it changes, those two functions
have to change. It's much better to just pass the values and let them operate on that.
Data coupling
is starting to get to where we need to be. One module depends on another for data. It's a typical function call with parameters:
// in module A
B.add(2, 4)
Message coupling
looks like data coupling, but it's even looser because two modules communicate indirectly without ever passing
each other data. Method calls have no parameters, in other words.
No coupling, like Wikipedia says, is when "modules do not communicate at all with one another."
There is no dependency from code A to code B.
Concluding Remarks
So how do we reconcile the thought that "if I separate code to increase functional cohesion, I introduce dependencies
which is going to increase coupling" with the assertion that low coupling and high cohesion go hand in hand? To do that,
you must recognize that the dependencies already exist. Perhaps not at the class level, but they do at the lines of
code level. By pulling them out into related units, you remove the spaghetti structure (if you can call it that)
and turn it into something much more manageable.
A system of code can never be completely de-coupled unless it does nothing. Cohesion is a different story.
I can't claim that your code cannot be perfectly cohesive, but I can't claim that it can. My belief is it
can be very close, but at some point you'll encounter diminishing returns on your quest to make it so.
The key takeaway is to start looking at your code and think about what you can do to improve it as you notice
the relationships between each line you write start take shape.
Comments and corrections(!) are encouraged. What are your thoughts?
Posted by Sam on Jun 09, 2009 at 06:04 AM UTC - 5 hrs
From time to time I like to actually post a bit of code on this programming blog, so here's
a stream-of-conscious (as in "not a lot of thought went into design quality") example that shows how to:
Open Excel, making it invisible (or visible) to the user.
Create a workbook and access individual worksheets
Add data to a cell, or retrieve data from a cell
Add a chart to a worksheet, with constants for various chart types
If you know where I can find the constants for file type numbers, that would be appreciated. Calling SaveAs
without the type seems to use whatever version of Excel you are running, but I'd like to find how to save as
CSV or other formats.
Needless to say, this requires Excel be on the computer that's running the code.
require'win32ole'
xl = WIN32OLE.new("Excel.Application")puts"Excel failed to start"unless xl
xl.Visible =false
workbook = xl.Workbooks.Add
sheet = workbook.Worksheets(1)#create some fake data
data_a =[](1..10).each{|i| data_a.push i }
data_b =[](1..10).each{|i| data_b.push((rand*100).to_i)}#fill the worksheet with the fake data#showing 3 ways to populate cells with values(1..10).eachdo |i|
sheet.Range("A#{i}").Select
xl.ActiveCell.Formula = data_a[i-1]
sheet.Range("B#{i}").Formula = data_b[i-1]
cell = sheet.Range("C#{i}")
cell.Formula ="=A#{i} - B#{i}"end#chart type constants (via http://support.microsoft.com/kb/147803)
xlArea =1
xlBar =2
xlColumn =3
xlLine =4
xlPie =5
xlRadar =-4151
xlXYScatter =-4169
xlCombination =-4111
xl3DArea =-4098
xl3DBar =-4099
xl3DColumn =-4100
xl3DLine =-4101
xl3DPie =-4102
xl3DSurface =-4103
xlDoughnut =-4120#creating a chart
chart_object = sheet.ChartObjects.Add(10, 80, 500, 250)
chart = chart_object.Chart
chart_range = sheet.Range("A1", "B10")
chart.SetSourceData(chart_range, nil)
chart.ChartType = xlXYScatter
#get the value from a cell
val = sheet.Range("C1").Value
puts val
#saving as pre-2007 format
excel97_2003_format =-4143
pwd = Dir.pwd.gsub('/','\\') << '\\'#otherwise, it sticks it in default save directory- C:\Users\Sam\Documents on my system
workbook.SaveAs("#{pwd}whatever.xls", excel97_2003_format)
xl.Quit
The list is not intended to be a "one-size-fits-all" list.
Instead, "the key is to ask challenging questions that enable you to distinguish the smart software
developers from the moronic mandrills." Even still, "for most of the questions in this list there are no
right and wrong answers!"
Keeping that in mind, I thought it would be fun for me to provide my off-the-top-of-my-head answers,
as if I had not prepared for the interview at all. Here's that attempt.
Though I hope otherwise, I may fall flat on my face. Be nice, and enjoy (and help out where you can!).
Last week's answers on Functional Design
had me feeling that way. Luckily, this week we come to technical design - a topic I feel quite a bit stronger on.
More...
What do low coupling and high cohesion mean? What does the principle of encapsulation mean?
Coupling refers to how strongly or loosely components in a system are tied together. You want that to be
low. Cohesion refers to how well the individual parts of a unit of code fit together for a single purpose.
Encapsulation is about containing implementation of code so that outsiders don't need to know how it's works
on the inside. By doing so you can reduce negative effects of coupling.
Reading: Robert C. Martin's SOLID
principles of OOD, which have been linked on this blog since day 1. His book, Agile Software Development: Principles, Patterns, and Practices
is another great resource for this topic. It's short and to the point, and comes highly recommended from myself.
How do you manage conflicts in a web application when different people are editing the same data?
Set a flag when someone starts editing data unit A. If someone else loads it, let them know it's being edited
and that it's currently in read only mode. If the race was too fast, you can also have a check on the
commit side to let them know their changes conflict with another user, present them the data, and then let
them figure out how to merge the changes. This has rarely been a problem in my experience, but it could be,
and that's how I'd deal with it if the requirement came up. (If the changes don't conflict, you could
simply keep the user unaware as well.)
My answer comes from the things you see in normal usage of shared files or just about any shared resource,
for that matter. Originally,
it comes down to race conditions, so you might
be able to extrapolate some useful information from that low-level explanation.
Do you know about design patterns? Which design patterns have you used, and in what situations?
I know about design patterns. Most of the ones I'm familiar with, at least in the canonical book,
aren't of much daily use to me, as I tend to work in dynamic languages, where the sorts of flaws that precipitate
the patterns (as implemented in the book) just aren't factors as often as in other languages. (Yes, some of the
book is implemented in Smalltalk. I can implement them with as much superfluous junk as you desire in
any language - that doesn't make it a necessity.)
I suppose most frequently I've used the Strategy pattern.
(Perhaps the fact that I've focused so much on one in particular is a weakness in my coding style?) The situations
are when an interface should remain the same while the implementation should differ somewhat. I don't have a
concrete example on the top of my head.
If I were to start working in Java again, or building larger applications in .NET (I currently build very small
apps in that space as part of my job), I'd re-read the book. I might even scan the inner cover daily just as a
refresher.
I wouldn't say I'm strong on design patterns, but I've got reference information and know where to look
should I need to, along with the facilities to become strong should my situation call for it.
Do you know what a stateless business layer is? Where do long-running transactions fit into that picture?
I hadn't heard it as a single term until now, but knowing the individual terms lets me say that objects in
the business layer (or domain model) are transient - or that their state is not preserved in memory between
subsequent requests for the same object.
This may note bode well for long running transactions, as state presumably must be set up each time an
object is loaded, along with any process that might be required for tear-down.
For reading, this is just information as I've come across it throughout my various readings, so I don't
know what to recommend.
What kinds of diagrams have you used in designing parts of an architecture, or a technical design?
UML, or some bastardization of it has always been enough.
Most likely the bastardized part where we just do a little design on paper or a whiteboard to gain a
better understanding of the intended design through some sketch-work.
I've never been a part of a team that practices BDUF, nor
have I felt the need for it in any personal projects, so I'm light on recommendations for reading.
The Wikipedia article on UML is
sufficient for my tastes, but I've know people who dove into Martin Fowler's books
and came away more knowledgeable, so that may help you.
Can you name the different tiers and responsibilities in an N-tier architecture?
For what value of N? (I mean, we could have N=1000000 and I wouldn't know -- or if I did know, we might
be here all day.) Normally N=3, so we might be talking about presentation, logic, and data tiers. Sometimes
we might talk about Entities and others, or we might be considering (mistakenly?) MVC.
I think the responsibilities are clear by their names, but if you'd like to discuss further, I'm
certainly okay with doing so.
Can you name different measures to guarantee correctness and robustness of data in an architecture?
I need a little direction here. It seems to me this is a product of many things, and I don't know where to start.
For instance, we could say that unit tests and integration tests can go part of the way there. We could
talk about validating user input, and that it matches some definition of "looking correct." We could
have checks coded and in place between the various systems that make up the architecture. Constraints on the
database. I could go on if I were giving myself more time to think about it.
Because of the open-endedness in this question, there are any number of references. I'd dive into
automated testing in its various forms, which when applied to the situation, should get you most of the
way there.
Can you name any differences between object-oriented design and component-based design?
To be honest, this is the first I've heard of component-based design, so no, I can't name the differences.
My thoughts would go towards having objects to design around (as in C++) vs. not having objects to
design around (as in C).
As it happens, there may be a reason the term "component-based design" seems new to me: IEEE held the
"1st ... workshop" on it not 6 months ago. They could very well be behind the times.
Searching with Google also indicates this may be designing from the view of the outside,
as in SOA.
I think the SOLID principles I mentioned above go beyond the availability of objects-proper, so I don't expect
to be surprised here. However, I can't offer you any reading advice and without a definitive source
from the Google results, I cannot even tell if I'm in the right ballpark.
Your thoughts are especially encouraged on this topic.
How would you model user authorization, user profiles and permissions in a database?
I wouldn't typically model the authorization piece in the DB. If I read you correctly, I'm
guessing you mean the storage of authorization information in the database, as opposed to the
act of authorizing. Under that assumption, I've modeled this situation in just about every
way I can imagine. A couple of scenarios:
a. Under a denormalized scenario, I'd keep a table of permissions and a table of users (which includes authorization
information, profile information, and a list of permissions from the users table). This isn't ideal if permissions
ever change, and especially not if you're returning a ton of users for the purpose of authorization while the
profile information is especially large. In that case you're transferring way more data than you need, and it could
result in performance problems. (The extra data transfer may only be a problem with ORM tools, as you
could always hand-write the queries to return only what you need.
On the other hand, storing of redundant data is a problem if storage space itself is an issue.)
b. Under a completely normalized scenario, we'd have a table of permissions, a table relating users
to permissions, and a table for users. For the sake of cohesion (and potentially optimizing data transfer)
we might separate the users table into one for authentication and another for profile, while keeping the
relationship with permissions based on user_auth.
c. Some variation in between the two extremes.
For reading? For me it's based on experience, and perhaps a couple of database courses in college. I guess
just about any book on database design would do. I wouldn't bother trying to understand the formal
academic descriptions of database normalization,
but if you want to, you can only be better for it (as long as you can recognize the tradeoffs due to extra
joins!) Reader suggestions are highly welcome, as always.
How would you model the animal kingdom (with species and their behavior) as a class system?
This one might deserve a blog post all on its own. It depends: If I'm working in a language with
multiple inheritance, I'd use a combination of class hierarchy that follows the animal kingdom along
with mixins (which are also inheritance, but with less of a hierarchical attitude) for behavior shared
between and among the hierarchy levels. Without multiple inheritance, I'd have to resort to
interfaces where available, and composition for actual code reuse where it made sense.
The short answer though, is that I probably wouldn't implement it as a class system. If I really was working
with taxonomy and biological classification,
I don't think I'd model the real world with objects. I'd need to look into the subject quite a bit further
to tell you how I would do it, but suffice to say I don't think it'd be using objects to match
it one-for-one, or even something resembling one-to-one.
Reading: I wouldn't know where to begin. The SOLID principles will guide you, but I wouldn't think that's all
there is to it.
What do you think? Where would your answers differ?