Posted by Sam on Apr 09, 2008 at 07:34 AM UTC - 5 hrs
Something's been bothering me lately. It's nothing, really. ?, ?, null, nil, or whatever you want to call it. I think we've got it backwards in many cases. Many languages like to throw errors when you try to use nothing as if it were something else - even if it's nothing fancy.
I think a better default behavior would be to do nothing - at most log an error somewhere, or allow us a setting - just stop acting as if the world came to an end because I *gasp* tried to use null as if it were a normal value.
In fact, just because it's nothing, doesn't mean it can't be something. It is something - a concept at the minimum. And there's nothing stopping us from having an object that represents the concept of nothing.
More...
Exceptions should be thrown when something exceptional happens. Maybe encountering a null value was at some time, exceptional. But in today's world of databases, form fields, and integrated disparate systems, you don't know where your data is at or where it's coming from - and encountering null is the rule, not the exception.
Expecting me to write paranoid code and add a check for null to avoid every branch of code where it might occur is ludicrous. There's no reason some sensible default behavior can't be chosen for null, and if I really need something exceptional to happen, I can check for it.
Really, aren't you sick of writing code like this:
string default = "";
if(form["field"] != null and boolFromDBSaysSetIt != null
and boolFromDBSaysSetIt)
default = form["field"];
when you could be writing code like this:
if(boolFromDBSaysSetIt)
default = form["field"];
I think this is especially horrid for conditional checks. When I write if(someCase) it's really just shorthand for if(someCase eq true). So why, when
someCase is null or not a boolean should it cause an error? It's not
true, so move on - don't throw an error.
Someone tell me I'm wrong. It feels like I should be wrong. But it feels worse to have the default path always be the one of most resistance.
What do you think?
Hey! Why don't you make your life easier and subscribe to the full post
or short blurb RSS feed? I'm so confident you'll love my smelly pasta plate
wisdom that I'm offering a no-strings-attached, lifetime money back guarantee!
Last modified on Apr 09, 2008 at 07:39 AM UTC - 5 hrs
Posted by Sam on Feb 13, 2008 at 08:44 AM UTC - 5 hrs
One step back from greatness lies the very definition of the impossible leadership situation:
a president affiliated with a set of established commitments that have in the course of
events been called into question as failed or irrelevant responses to the problems of the day...
The instinctive political stance of the establishment affiliate -- to affirm and continue the
work of the past -- becomes at these moments a threat to the vitality, if not survival,
of the nations, and leadership collapses upon a dismal choice. To affirm established
commitments is to stigmatize oneself as a symptom of the nation's problems and the premier
symbol of systemic political failure; to repudiate them is to become isolated from one's most
natural political allies and to be rendered impotent.
A little while ago Obie asked " What's this crap about a Ruby backlash?" The whole situation has reminded me of Skowronek's work, so I dug a couple of passages up.
We're at a crossroads right now between two regimes - one represented by Java, and the other represented by Ruby (although it is quite a bit more nuanced than that). My belief right now is that Java The Language is in a position where it can't win. People are fed up with the same old crap, and a change is happening (see also: Why Do I Have To Tell The Compiler Twice?, or Adventures in Talking To a Compiler That Doesn't Listen.)
More...
What these [reconstructive] presidents did, and what their predecessors could not do, was to
reformulate the nation's political agenda altogether, ... and to move the nation past the old
problems, eyeing a different set of possibilities... (Skowronek, pg. 38)
When the new regime starts gaining momentum, in the old regime there will be wailing and gnashing of teeth. We can see some of this in the dogma repeated by Ruby's detractors alluded to (but not sourced) by Daniel Spiewak. We hear it in the fear in people's comments when they fail to criticize the ideas, relying instead on ad hominem attacks that have little to nothing to do with the issues at hand.
(Unlike Obie, I don't have any reason to call attention to anyone by name. If you honestly haven't seen this, let's try i don't like ruby, ruby sucks, and ruby is slow and see if we can weed through the sarcasm, apologists who parrot the line so as not to offend people, or just those exact words with no other substance. )
There will also fail to be unity.
Java is considering closures, and some want multiline string literals. There's talk that Java is done and that there should be a return to the good ol' days.
Neal Gafter quotes himself and Joshua Bloch in Is Java Dying? (where he concludes that it isn't):
Neal Gafter: "If you don't want to change the meaning of anything ever, you have no choice but to not do anything. The trick is to minimize the effect of the changes while enabling as much as possible. I think there's still a lot of room for adding functionality without breaking existing stuff..."
Josh Bloch: "My view of what really happens is a little bit morbid. I think that languages and platforms age by getting larger and clunkier until they fall over of their own weight and die very very slowly, like over ... well, they're all still alive (though not many are programming Cobol anymore). I think it's a great thing, I really love it. I think it's marvelous. It's the cycle of birth, and growth, and death. I remember James saying to me [...] eight years ago 'It's really great when you get to hit the reset button every once and a while.'"
To me, the debate is starting to look a lot like the regime change Skowronek's work predicts when going from a vulnerable establishment regime where an outsider reconstructs a new one.
I'm not saying Ruby itself will supplant Java. But it certainly could be a piece of the polyglot programming puzzle that will do it. It's more of an overall paradigm shift than a language one, so although I say one part is represented by Java and another by Ruby, I hope you won't take me literally.
Franklin Roosevelt was the candidate with "clean hands" at a moment when failed policies,
broken promises, and embarrassed clients were indicting a long-established political order.
Agitating for a rout in 1932, he inveighed against the entire "Republican leadership." He
denounced them as false prophets of prosperity, charged them with incompetence in dealing with
economic calamity, and convicted them of intransigence in the face of social desperation.
Declaring their regime morally bankrupt, he campaigned to cut the knot, to raise a new standard,
to restore to American government the ancient truths that had first inspired it.
(Skowronek, pg 288)
Hoover's inability to take the final step in innovation and
repudiate the system he was transforming served his critic's well... Hoover would later
lament the people's failure to appreciate the significance of his policies, and yet he was
the first to deny it. The crosscurrents of change in the politics of leadership left him with
an impressive string of policy successes, all of which added up to one colossal political
failure... Hoover sought to defend a system that he had already dispensed with...
(Skowronek, pg. 284-285)
Which one sounds like which paradigm?
Last modified on Feb 13, 2008 at 08:45 AM UTC - 5 hrs
Posted by Sam on Jan 14, 2008 at 06:42 AM UTC - 5 hrs
This is a story about my journey as a programmer, the major highs and lows I've had along the way, and
how this post came to be. It's not about how ecstasy made me a better programmer, so I apologize if that's why you came.
In any case, we'll start at the end, jump to
the beginning, and move along back to today. It's long, but I hope the read is as rewarding as the write.
A while back,
Reg Braithwaite
challenged programing bloggers with three posts he'd love to read (and one that he wouldn't). I loved
the idea so much that I've been thinking about all my experiences as a programmer off and on for the
last several months, trying to find the links between what I learned from certain languages that made
me a better programmer in others, and how they made me better overall. That's how this post came to be.
More...
The experiences discussed herein were valuable in their own right, but the challenge itself is rewarding
as well. How often do we pause to reflect on what we've learned, and more importantly, how it has changed
us? Because of that, I recommend you perform the exercise as well.
I freely admit that some of this isn't necessarily caused by my experiences with the language alone - but
instead shaped by the languages and my experiences surrounding the times.
One last bit of administrata: Some of these memories are over a decade old, and therefore may bleed together
and/or be unfactual. Please forgive the minor errors due to memory loss.
How QBASIC Made Me A Programmer
As I've said before, from the time I was very young, I had an interest in making games.
I was enamored with my Atari 2600, and then later the NES.
I also enjoyed a playground game with Donald Duck
and Spelunker.
Before I was 10, I had a notepad with designs for my as-yet-unreleased blockbuster of a side-scrolling game that would run on
my very own Super Sanola game console (I had the shell designed, not the electronics).
It was that intense interest in how to make a game that led me to inspect some of the source code Microsoft
provided with QBASIC. After learning PRINT, INPUT,
IF..THEN, and GOTO (and of course SomeLabel: to go to)
I was ready to take a shot at my first text-based adventure game.
The game wasn't all that big - consisting of a few rooms, the NEWS
directions, swinging of a sword against a few monsters, and keeping track of treasure and stats for everything -
but it was a complete mess.
The experience with QBASIC taught me that, for any given program of sufficient complexity, you really only
need three to four language constructs:
- Input
- Output
- Conditional statements
- Control structures
Even the control structures may not be necessary there. Why? Suppose you know a set of operations will
be performed an unknown but arbitrary amount of times. Suppose also that it will
be performed less than X number of times, where X is a known quantity smaller than infinity. Then you
can simply write out X number of conditionals to cover all the cases. Not efficient, but not a requirement
either.
Unfortunately, that experience and its lesson stuck with me for a while. (Hence, the title of this weblog.)
Side Note: The number of language constructs I mentioned that are necessary is not from a scientific
source - just from the top of my head at the time I wrote it. If I'm wrong on the amount (be it too high or too low), I always appreciate corrections in the comments.
What ANSI Art taught me about programming
When I started making ANSI art, I was unaware
of TheDraw. Instead, I opened up those .ans files I
enjoyed looking at so much in MS-DOS Editor to
see how it was done. A bunch of escape codes and blocks
came together to produce a thing of visual beauty.
Since all I knew about were the escape codes and the blocks (alt-177, 178, 219-223 mostly), naturally
I used the MS-DOS Editor to create my own art. The limitations of the medium were
strangling, but that was what made it fun.
And I'm sure you can imagine the pain - worse than programming in an assembly language (at least for relatively
small programs).
Nevertheless, the experience taught me some valuable lessons:
- Even though we value people over tools, don't underestimate
the value of a good tool. In fact, when attempting anything new to you, see if there's a tool that can
help you. Back then, I was on local BBSs, and not
the 1337 ones when I first started out. Now, the Internet is ubiquitous. We don't have an excuse anymore.
-
I can now navigate through really bad code (and code that is limited by the language)
a bit easier than I might otherwise have been able to do. I might have to do some experimenting to see what the symbols mean,
but I imagine everyone would.
And to be fair, I'm sure years of personally producing such crapcode also has
something to do with my navigation abilities.
-
Perhaps most importantly, it taught me the value of working in small chunks and
taking baby steps.
When you can't see the result of what you're doing, you've got to constantly check the results
of the latest change, and most software systems are like that. Moreover, when you encounter
something unexpected, an effective approach is to isolate the problem by isolating the
code. In doing so, you can reproduce the problem and problem area, making the fix much
easier.
The Middle Years (included for completeness' sake)
The middle years included exposure to Turbo Pascal,
MASM, C, and C++, and some small experiences in other places as well. Although I learned many lessons,
there are far too many to list here, and most are so small as to not be significant on their own.
Therefore, they are uninteresting for the purposes of this post.
However, there were two lessons I learned from this time (but not during) that are significant:
-
Learn to compile your own $&*@%# programs
(or, learn to fish instead of asking for them).
-
Stop being an arrogant know-it-all prick and admit you know nothing.
As you can tell, I was quite the cowboy coding young buck. I've tried to change that in recent years.
How ColdFusion made me a better programmer when I use Java
Although I've written a ton of bad code in ColdFusion, I've also written a couple of good lines
here and there. I came into ColdFusion with the experiences I've related above this, and my early times
with it definitely illustrate that fact. I cared nothing for small files, knew nothing of abstraction,
and horrendous god-files were created as a result.
If you're a fan of Italian food, looking through my code would make your mouth water.
DRY principle?
Forget about it. I still thought code reuse meant copy and paste.
Still, ColdFusion taught me one important aspect that got me started on the path to
Object Oriented Enlightenment:
Database access shouldn't require several lines of boilerplate code to execute one line of SQL.
Because of my experience with ColdFusion, I wrote my first reusable class in Java that took the boilerplating away, let me instantiate a single object,
and use it for queries.
How Java taught me to write better programs in Ruby, C#, CF and others
It was around the time I started using Java quite a bit that I discovered Uncle Bob's Principles of OOD,
so much of the improvement here is only indirectly related to Java.
Sure, I had heard about object oriented programming, but either I shrugged it off ("who needs that?") or
didn't "get" it (or more likely, a combination of both).
Whatever it was, it took a couple of years of revisiting my own crapcode in ColdFusion and Java as a "professional"
to tip me over the edge. I had to find a better way: Grad school here I come!
The better way was to find a new career. I was going to enter as a Political Scientist
and drop programming altogether. I had seemingly lost all passion for the subject.
Fortunately for me now, the political science department wasn't accepting Spring entrance, so I decide to
at least get started in computer science. Even more luckily, that first semester
Venkat introduced me to the solution to many my problems,
and got me excited about programming again.
I was using Java fairly heavily during all this time, so learning the principles behind OO in depth and
in Java allowed me to extrapolate that for use in other languages.
I focused on principles, not recipes.
On top of it all, Java taught me about unit testing with
JUnit. Now, the first thing I look for when evaluating a language
is a unit testing framework.
What Ruby taught me that the others didn't
My experience with Ruby over the last year or so has been invaluable. In particular, there are four
lessons I've taken (or am in the process of taking):
-
The importance of code as data, or higher-order functions, or first-order functions, or blocks or
closures: After learning how to appropriately use
yield, I really miss it when I'm
using a language where it's lacking.
-
There is value in viewing programming as the construction of lanugages, and DSLs are useful
tools to have in your toolbox.
-
Metaprogramming is OK. Before Ruby, I used metaprogramming very sparingly. Part of that is because
I didn't understand it, and the other part is I didn't take the time to understand it because I
had heard how slow it can make your programs.
Needless to say, after seeing it in action in Ruby, I started using those features more extensively
everywhere else. After seeing Rails, I very rarely write queries in ColdFusion - instead, I've
got a component that takes care of it for me.
-
Because of my interests in Java and Ruby, I've recently started browsing JRuby's source code
and issue tracker.
I'm not yet able to put into words what I'm learning, but that time will come with
some more experience. In any case, I can't imagine that I'll learn nothing from the likes of
Charlie Nutter, Ola Bini,
Thomas Enebo, and others. Can you?
What's next?
Missing from my experience has been a functional language. Sure, I had a tiny bit of Lisp in college, but
not enough to say I got anything out of it. So this year, I'm going to do something useful and not useful
in Erlang. Perhaps next I'll go for Lisp. We'll see where time takes me after that.
That's been my journey. What's yours been like?
Now that I've written that post, I have a request for a post I'd like to see:
What have you learned from a non-programming-related discipline that's made you a better programmer?
Last modified on Jan 16, 2008 at 07:09 AM UTC - 5 hrs
Posted by Sam on Dec 24, 2007 at 04:52 PM UTC - 5 hrs
Suppose for the purposes of our example we have string the_string of length n, and we're trying to determine if string the_substring of length m is found within the_string.
The straightforward approach in many languages would be to use a find() or indexOf() function on the string. It might look like this:
More...
substring_found_at_index = the_string.index(the_substring)
However, if no such method exists, the straightforward approach would be to just scan all the substrings and compare them against the_substring until you find a match. Even if the aforementioned function exists, it likely uses the same strategy:
def find_substring_pos(the_string, the_substring)
(0..(the_string.length-1)).each do |i|
this_sub = the_string[i,the_substring.length]
return i if this_sub == the_substring
end
return nil
end
That is an O(n) function, which is normally fast enough.
Even though I'm one of the guys who camps out in line so I can be one of the first to say "don't prematurely optimize your code," there are situations where the most straightforward way to program something just doesn't work. One of those situations is where you have a long string (or set of data), and you will need to do many comparisons over it looking for substrings. Although you'll find it in many cases, an example of the need for this I've seen relatively recently occurs in bioinformatics, when searching through an organism's genome for specific subsequences. (Can you think of any other examples you've seen?)
In that case, with m much smaller than a very large n, O(m * log n) represents a significant improvement over O(n) (or worst case m*n). We can get there with a suffix array.
Of course building the suffix array takes some time - so much so that if we had to build it for each comparison, we're better off with the straightforward approach. But the idea is that we'll build it once, and reuse it many times, amortizing the cost out to "negligible" over time.
The idea of the suffix array is that you store every suffix of a string (and it's position) in a sorted array. This way, you can do a binary search for the substring in log n time. After that, you just need to compare to see if the_substring is there, and if so, return the associated index.
The Wikipedia page linked above uses the example of "abracadabra." The suffix array would store each of these suffixes, in order:
a
abra
abracadabra
acadabra
adabra
bra
bracadabra
cadabra
dabra
ra
racadabra
Below is an implementation of a suffix array in Ruby. You might want to write a more efficient sort algorithm, as I'm not sure what approach Enumerable#sort takes. Also, you might want to take into account the
ability to get all substrings, not just the first one to be found.
class SuffixArray
def initialize(the_string)
@the_string = the_string
@suffix_array = Array.new
#build the suffixes
last_index = the_string.length-1
(0..last_index).each do |i|
the_suffix = the_string[i..last_index]
the_position = i
# << is the append (or push) operator for arrays in Ruby
@suffix_array << { :suffix=>the_suffix, :position=>the_position }
end
#sort the suffix array
@suffix_array.sort! { |a,b| a[:suffix] <=> b[:suffix] }
end
def find_substring(the_substring)
#uses typical binary search
high = @suffix_array.length - 1
low = 0
while(low <= high)
mid = (high + low) / 2
this_suffix = @suffix_array[mid][:suffix]
compare_len = the_substring.length-1
comparison = this_suffix[0..compare_len]
if comparison > the_substring
high = mid - 1
elsif comparison < the_substring
low = mid + 1
else
return mid
end
end
return nil
end
end
sa = SuffixArray.new("abracadabra")
puts sa.find_substring("ac") #outputs 3
Thoughts, corrections, and improvements are always appreciated.
Last modified on Dec 24, 2007 at 04:57 PM UTC - 5 hrs
Posted by Sam on Dec 03, 2007 at 06:32 AM UTC - 5 hrs
It's not a hard thing to come up with, but it's incredibly useful. Suppose you need to
iterate over each pair of values or indices in an array. Do you really want to
duplicate those nested loops in several places in your code? Of course not. Yet
another example of why code as data is such a powerful concept:
More...
class Array
# define an iterator over each pair of indexes in an array
def each_pair_index
(0..(self.length-1)).each do |i|
((i+1)..(self.length-1 )).each do |j|
yield i, j
end
end
end
# define an iterator over each pair of values in an array for easy reuse
def each_pair
self.each_pair_index do |i, j|
yield self[i], self[j]
end
end
end
Now you can just call array.each_pair { |a,b| do_something_with(a, b) }.
Posted by Sam on Nov 22, 2007 at 12:04 PM UTC - 5 hrs
Since the gift buying season is officially upon us, I thought I'd pitch in to the rampant consumerism and list some of the toys I've had a chance to play with this year that would mean fun and learning for the programmer in your life. Plus, the thought of it sounded fun.
Here they are, in no particular order other than the one in which I thought of them this morning:
More...
- JetBrains' IntelliJ IDEA: An awesome IDE for Java. So great, I don't mind spending the $249 (US) and using it over the free Eclipse. The Ruby plugin is not too shabby either, the license for your copy is good for your OSX and Windows installations, and you can try it free for 30 days. Martin Fowler thinks IntelliJ changed the IDE landscape. If you work in .NET, they also have ReSharper, which I plan to purchase very soon. Now if only we could get a ColdFusion plugin for IntelliJ, I'd love it even more.
- Programming Ruby, Second Edition: What many in the Ruby community consider to be Ruby's Bible. You can lower the barrier of entry for your favorite programmer to using Ruby, certainly one of the funner languages a lot of people are loving to program in lately. Sometimes, I sit and think about things to program just so I can do it in Ruby.
If they've already got that, I always like books as gifts. Some of my
favorites from this year have been: Code Complete 2, Agile Software Development: Principles, Patterns, and Practices which has a great section on object oriented design principles, and of course,
My Job Went to India.
I have a slew of books I've yet to read this year that I got from last Christmas (and birthday), so I'll have to
list those next year.
-
Xbox 360 and a subscription to
XNA Creator's Club (through Xbox Live Marketplace - $99 anually) so they can deploy their games to their new Xbox. This is without a
doubt the thing I'd want most, since I got into this whole programming thing because I was interested
in making games. You can point them to the
getting started page, and they could
make games for the PC for free, using XNA (they'll need that page to get started anyway, even if you
get them the 360 and Creator's Club membership to deploy to the Xbox).
-
MacBook Pro and pay for the extra pixels. I love mine - so much so,
that I intend to marry it. (Ok, not that much, but I have
been enjoying it.)
The extra pixels make the screen almost as wide as two, and if you can get them an extra monitor I'd do
that too. I've moved over to using this as my sole computer for development, and don't bother with
the desktops at work or home anymore, except on rare occasions. You can run Windows on it, and the
virtual machines are getting really good so that you ought not have to even reboot to use either
operating system.
Even if you don't want to get them the MacBook, a second or third monitor should be met with enthusiasm.
-
A Vacation: Programmers are notoriously working long hours
and suffering burnout, so we often need to take a little break from the computer screen. I like
SkyAuction because all the vacations are package deals, there's often a good variety to choose from (many
different countries), most of the time you can find a very good price, and usually the dates are flexible
within a certain time frame, so you don't have to commit right away to a certain date.
Happy Thanksgiving to those celebrating it, and thanks to all you who read and comment and set me straight when I'm wrong - not just here but in the community at large. I do appreciate it.
Do you have any ideas you'd like to share (or ones you'd like to strike from this list)?
Last modified on Nov 22, 2007 at 12:04 PM UTC - 5 hrs
Posted by Sam on Nov 07, 2007 at 01:39 PM UTC - 5 hrs
I just wanted to give a quick shout out to the IntelliJ IDEA Ruby plugin team for working so fast to get a fix out the door.
I had posted a question on the JRuby Development list about running Ruby unit tests against JRuby from within IntelliJ IDEA using the Ruby plugin. A couple of days went by and one of the developers of the plugin contacted me, worked with me on solving my problem, and released a new version that supported what I needed within another couple of days.
That's awesome service.
Posted by Sam on Nov 07, 2007 at 07:33 AM UTC - 5 hrs
The following was generated using a 7th order Markov chain and several of my blog posts as source text:
More...
For the monetary hurdles? A computer to prove anything on you.
If you are serving with talented coders so it doesn't change. Can you understand the case study present in different paradigm shifts in science. I was on vacation, "I can't see my old search terms every time you run into a code monkey? Are you a code monkey? Are you stick to standards so you could have implement our own develop good sound-editor to do it. Even more diplomatic terms, of course of action - if a proxy) the objectives and different approach much better, and how can you believe it?" I think it would solve those domain you're willing to him speak than after a little over a million users so it doesn't "mean that a competitors. Of course of action - if a proxy) the objectives and different paradigms to help out in the to-read list: The Power of integration in asking what other hand, JSF, Seam, and Tapestry are based on my to-read list.) Because I think you can keep the view updated - but there are made, and borders on absurd side of the elements in the code for you to record an improvement. That's not a far leap to see that you do?" Here he actually pointless. How do I know - I was doing before I started to deploy to XBOX 360 or PC (Developers created." Instead, you should use is a contest to avoid conversations. It can be threatened." What are the Maximizers are drive-in traffic and implementation of an old professor of mine, be honest about it. Thanks XNA!
As always came back in January. Perhaps you understand it. If you want to conversations. It can be threatened." What are the number of integration in asking people something to do it. Even more diplomatic terms, of course, you can do. In particular, by posting a good developers created." Instead, you should.
The point is that you're not passion.
There are a couple of repeats noticeable, which is mostly due to my lack of source text. At least, it worked a little better using one-hundred 8.5" x 11" pages of Great Expectations (still gibberish though).
Anyway, here's the Ruby source code:
class MarkovModel
def create_model(file, order, unit)
#unit = "word" or "character"
@unit = unit
entire_file = ""
@order = order
#read the entire file in
File.open(file, "r") do |infile|
cnt=0
while (line = infile.gets)
entire_file += line
cnt+=1
end
end
#split the file according to characters or words
if unit == "word"
split_on = /\s/
else
split_on = ""
end
@entire_file_split = entire_file.split(split_on)
#construct a hash like:
#first 'order' letters, letter following, increment count
@model = Hash.new
i = 0
@entire_file_split.each do |c|
this_group = @entire_file_split[i,order]
next_letter = @entire_file_split[i+order,1]
#if group doesn't exist, create it
if @model[this_group] == nil
@model[this_group] = { next_letter => 1 }
#if group exists, but this "next_letter" hasn't been seen, insert it
elsif @model[this_group][next_letter] == nil
@model[this_group][next_letter] = 1
#if group and next letter exist in model, increment the count
else
cur_count_of_next_letter = @model[this_group][next_letter] + 1
@model[this_group][next_letter] = cur_count_of_next_letter
end
i+=1
end
end
def generate_and_print_text(amount)
start_group = @entire_file_split[0,@order]
print start_group
this_group = start_group
(0..(amount-@order)).each do |i|
next_letters_to_choose_from = @model[this_group]
#construct probability hash
num = 0
probabilities = {}
next_letters_to_choose_from.each do |key, value|
num += value
probabilities[key] = num
end
#select next letter
index = rand(num)
matches = probabilities.select {|key, value| index <= value }
sorted_by_value = matches.sort{|a,b| a[1]<=>b[1]}
next_letter = sorted_by_value[0][0]
print " " if @unit == "word" #if we're splitting on words
print next_letter
#shift the group
this_group = this_group[1,@order-1] + next_letter.to_ary
end
end
def print_model
require 'pp'
PP.pp(@model)
end
end
file = File.expand_path "source_text.txt"
mm = MarkovModel.new
mm.create_model(file, 7, "character")
mm.generate_and_print_text(2000)
mm.create_model(file, 1, "word")
mm.generate_and_print_text(250)
#mm.print_model
Last modified on Nov 07, 2007 at 07:39 AM UTC - 5 hrs
Posted by Sam on Oct 31, 2007 at 04:26 PM UTC - 5 hrs
When looping over collections, you might find yourself needing elements that match only a certain
parameter, rather than all of the elements in the collection. How often do you see something like this?
foreach(x in a)
if(x < 10)
doSomething;
Of course, it can get worse, turning
into arrow code.
More...
What we really need here is a way to filter the collection while looping over it. Move that extra
complexity and indentation out of our code, and have the collection handle it.
In Ruby we have each
as a way to loop over collections. In C# we have foreach, Java's got
for(ElemType elem : theCollection), and Coldfusion has <cfloop collection="#theCollection#">
and the equivalent over arrays. But wouldn't it be nice to have an each_where(condition) { block; } or
foreach(ElemType elem in Collection.where(condition))?
I thought for sure someone would have implemented it in Ruby, so I was surprised at first to see this in my
search results:
However, after a little thought, I realized it's not all that surprising: it is already incredibily easy to filter
a collection in Ruby using the select method.
But what about the other languages? I must confess - I didn't think long and hard about it for Java or C#.
We could implement our own collections such that they have a where method that returns the subset we are
looking for, but to be truly useful we'd need the languages' collections to implement where
as well.
Of these four languages, ColdFusion provides both the need and opportunity, so I gave it a shot.
First, I set up a collection we can use to exercise the code:
<cfset beer = arrayNew(1)>
<cfset beer[1] = structNew()>
<cfset beer[1].name = "Guiness">
<cfset beer[1].flavor = "full">
<cfset beer[2] = structNew()>
<cfset beer[2].name = "Bud Light">
<cfset beer[2].flavor = "water">
<cfset beer[3] = structNew()>
<cfset beer[3].name = "Bass">
<cfset beer[3].flavor = "medium">
<cfset beer[4] = structNew()>
<cfset beer[4].name = "Newcastle">
<cfset beer[4].flavor = "full">
<cfset beer[5] = structNew()>
<cfset beer[5].name = "Natural Light">
<cfset beer[5].flavor = "water">
<cfset beer[6] = structNew()>
<cfset beer[6].name = "Boddington's">
<cfset beer[6].flavor = "medium">
Then, I exercised it:
<cfset daytypes = ["hot", "cold", "mild"]>
<cfset daytype = daytypes[randrange(1,3)]>
<cfif daytype is "hot">
<cfset weWantFlavor = "water">
<cfelseif daytype is "cold">
<cfset weWantFlavor = "full">
<cfelse>
<cfset weWantFlavor = "medium">
</cfif>
<cfoutput>
Flavor we want: #weWantFlavor#<br/><br/>
Beers with that flavor: <br/>
<cf_loop collection="#beer#" item="aBeer" where="flavor=#weWantFlavor#">
#aBeer.name#<br/>
</cf_loop>
</cfoutput>
Obviously, that breaks because don't have a cf_loop tag. So, let's create one:
<!--- loop.cfm --->
<cfparam name="attributes.collection">
<cfparam name="attributes.where" default = "">
<cfparam name="attributes.item">
<cfif thistag.ExecutionMode is "start">
<cfparam name="isDone" default="false">
<cfparam name="index" default="1">
<cffunction name="_getNextMatch">
<cfargument name="arr">
<cfloop from="#index#" to="#arrayLen(arr)#" index="i">
<cfset keyValue = attributes.where.split("=")>
<cfset index=i>
<cfif arr[i][keyValue[1]] is keyValue[2]>
<cfreturn arr[i]>
</cfif>
</cfloop>
<cfset index = arrayLen(arr) + 1>
<cfexit method="exittag">
</cffunction>
<cfset "caller.#attributes.item#" = _getNextMatch(attributes.collection,index)>
</cfif>
<cfif thistag.ExecutionMode is "end">
<cfset index=index+1>
<cfset "caller.#attributes.item#" = _getNextMatch(attributes.collection,index)>
<cfif index gt arrayLen(attributes.collection)>
<cfset isDone=true>
</cfif>
<cfif not isDone>
<cfexit method="loop">
<cfelse>
<cfexit method="exittag">
</cfif>
</cfif>
It works fine for me, but you might want to implement it differently. The particular area of improvement I
see right away would be to utilize the item name in the where attribute. That way,
you can use this on simple arrays and not just assume arrays of structs.
Thoughts anybody?
Last modified on Oct 31, 2007 at 04:29 PM UTC - 5 hrs
Posted by Sam on Oct 04, 2007 at 04:00 PM UTC - 5 hrs
A couple of weeks ago the UH Code Dojo embarked on the
fantastic voyage that is writing a program to solve Sudoku puzzles, in Ruby. This week, we
continued that journey.
Though we still haven't completed the problem (we'll be meeting again tenatively on October 15, 2007 to
do that), we did construct what we think is a viable plan for getting there, and began to implement some
of it.
The idea was based around this algorithm (or something close to it):
More...
while (!puzzle.solved)
{
find the most constrained row, column, or submatrix
for each open square in the most constrained space,
find intersection of valid numbers to fill the square
starting with the most constrained,
begin filling in open squares with available numbers
}
With that in mind, we started again with TDD.
I'm not going to explain the rationale behind each peice of code, since the idea was presented above.
However, please feel free to ask any questions if you are confused, or even if you'd just like
to challenge our ideas.
Here are the tests we added:
def test_find_most_constrained_row_column_or_submatrix
assert_equal "4 c", @solver.get_most_constrained
end
def test_get_available_numbers
most_constrained = @solver.get_most_constrained
assert_equal [3,4,5], @solver.get_available_numbers(most_constrained).sort
end
def test_get_first_empty_cell_from_most_constrained
most_constrained = @solver.get_most_constrained
indices = @solver.get_first_empty_cell(most_constrained)
assert_equal [2,4], indices
end
def test_get_first_empty_cell_in_submatrix
indices = @solver.get_first_empty_cell("0 m")
assert_equal [0,2], indices
end
And here is the code to make the tests pass:
def get_most_constrained
min_open = 10
min_index = 0
@board.each_index do |i|
this_open = @board[i].length - @board[i].reject{|x| x==0}.length
min_open, min_index = this_open, i.to_s + " r" if this_open < min_open
end
#min_row = @board[min_index.split(" ")]
@board.transpose.each_index do |i|
this_open = @board.transpose[i].length - @board.transpose[i].reject{|x| x==0}.length
min_open, min_index = this_open, i.to_s + " c" if this_open < min_open
end
(0..8).each do |index|
flat_subm = @board.get_submatrix_by_index(index).flatten
this_open = flat_subm.length - flat_subm.reject{|x| x==0}.length
min_open, min_index = this_open, index.to_s + " m" if this_open < min_open
end
return min_index
end
def get_available_numbers(most_constrained)
index, flag = most_constrained.split(" ")[0].to_i,most_constrained.split(" ")[1]
avail = [1,2,3,4,5,6,7,8,9] - @board[index] if flag == "r"
avail = [1,2,3,4,5,6,7,8,9] - @board.transpose[index] if flag == "c"
avail = [1,2,3,4,5,6,7,8,9] - @board.get_submatrix_by_index(index).flatten if flag == "m"
return avail
end
def get_first_empty_cell(most_constrained)
index, flag = most_constrained.split(" ")[0].to_i,most_constrained.split(" ")[1]
result = index, @board[index].index(0) if flag == "r"
result = @board.transpose[index].index(0), index if flag == "c"
result = index%3*3,index*3+@board.get_submatrix_by_index(index).flatten.index(0) if flag == "m"
return result
end
Obviously we need to clean out that commented-out line, and
I feel kind of uncomfortable with the small amount of tests we have compared to code. That unease was
compounded when we noticed a call to get_submatrix instead of get_submatrix_by_index.
Everything passed because we were only testing the first most constrained column. Of course, it will
get tested eventually when we have test_solve, but it was good that the pair-room-programming
caught the defect.
I'm also not entirely convinced I like passing around the index of the most constrained whatever along
with a flag denoting what type it is. I really think we can come up with a better way to do that, so
I'm hoping that will change before we rely too much on it, and it becomes impossible to change without
cascading.
Finally, we also set up a repository for this and all of our future code. It's not yet open to the public
as of this writing (though I expect that
to change soon). In any case, if you'd like to get the full source code to this, you can find our Google Code
site
at http://code.google.com/p/uhcodedojo/.
If you'd like to aid me in my quest to be an idea vacuum (cleaner!), or just have a question, please feel
free to contribute with a comment.
Last modified on Oct 04, 2007 at 04:12 PM UTC - 5 hrs
Posted by Sam on Sep 25, 2007 at 06:39 AM UTC - 5 hrs
The last bit of advice from Chad Fowler's 52 ways to save your job was to be a generalist, so this week's version is the obvious opposite: to be a specialist.
The intersection point between the two seemingly disparate pieces of advice is that you shouldn't use your lack of experience in multiple technologies to call yourself a specialist in another. Just because you
develop in Java to the exclusion of .NET (or anything else) doesn't make you a Java specialist. To call yourself that,
you need to be "the authority" on all things Java.
More...
Chad mentions a measure he used to assess a job candidate's depth of knowledge in Java: a question of how to make the JVM crash.
I'm definitely lacking in this regard. I've got a pretty good handle on Java, Ruby, and ColdFusion. I've done a small amount of work in .NET and have been adding to that recently. I can certainly write a program that will crash - but can I write one to crash the virtual
machine (or CLR)?
I can relunctantly write small programs in C/C++, but I'm unlikely to have the patience to trace through a large program for fun. I might even still be able to figure out some assembly language if you gave me enough time. Certainly in these lower level items it's not hard to find a way to crash. It's
probably harder to avoid it, in fact.
In ColdFusion, I've crashed the CF Server by simply writing recursive templates (those that cfinclude themselves). (However, I don't know if that still works.) In Java and .NET, I wouldn't know where to start. What about crashing a browser with JavaScript?
So Chad mentions that you should know the internals of JVM and CLR. I should know how JavaScript works in the browser and not just how to getElementById(). With that in mind, these things are going on the to-learn list - the goal being to find a way to crash each of them.
Ideas?
Last modified on Sep 25, 2007 at 06:41 AM UTC - 5 hrs
Posted by Sam on Sep 19, 2007 at 01:37 PM UTC - 5 hrs
A couple of days ago the UH Code Dojo met once again (we took the summer off). I had come in wanting to
figure out five different ways to implement binary search.
The first two - iteratively and recursively - are easy to come up with. But what about three
other implementations? I felt it would be a good exercise in creative thinking, and pehaps it
would teach us new ways to look at problems. I still want to do that at some point, but
the group decided it might be more fun to tackle to problem of solving any Sudoku board,
and that was fine with me.
Remembering the trouble
Ron Jeffries had in
trying to
TDD a solution
to Sudoku, I was a bit
weary of following that path, thinking instead we might try
Peter Norvig's approach. (Note: I haven't looked
at Norvig's solution yet, so don't spoil it for me!)
More...
Instead, we agreed that certainly there are some aspects that are testable, although
the actual search algorithm that finds solutions is likely to be a unit in itself, and therefore
isn't likely to be testable outside of presenting it a board and testing that its outputted solution
is a known solution to the test board.
On that note, our first test was:
require 'test/unit'
require 'sudokusolver'
class TestSudoku < Test::Unit::TestCase
def setup
@board = [[5, 3, 0, 0, 7, 0, 0, 0, 0],
[6, 0, 0, 1, 9, 5, 0, 0, 0],
[0, 9, 8, 0, 0, 0, 0, 6, 0],
[8, 0, 0, 0, 6, 0, 0, 0, 3],
[4, 0, 0, 8, 0, 3, 0, 0, 1],
[7, 0, 0, 0, 2, 0, 0, 0, 6],
[0, 6, 0, 0, 0, 0, 2, 8, 0],
[0, 0, 0, 4 ,1 ,9 ,0, 0, 5],
[0, 0, 0, 0, 8, 0, 0, 7, 9]]
@solution =[[5, 3, 4, 6, 7, 8, 9, 1, 2],
[6, 7, 2, 1, 9, 5, 3, 4, 8],
[1, 9, 8, 3, 4, 2, 5, 6, 7],
[8, 5, 9, 7, 6, 1, 4, 2, 3],
[4, 2, 6, 8, 5, 3, 7, 9, 1],
[7, 1, 3, 9, 2, 4, 8, 5, 6],
[9, 6, 1, 5, 3, 7, 2, 8, 4],
[2, 8, 7, 4 ,1 ,9 ,6, 3, 5],
[3, 4, 5, 2, 8, 6, 1, 7, 9]]
@solver = SudokuSolver.new
end
def test_solve
our_solution = @solver.solve(@board)
assert_equal(@solution, our_solution)
end
end
We promptly commented out the test though, since we'd never get it to pass until we were done. That
doesn't sound very helpful at this point. Instead, we started writing tests for testing the validity
of rows, columns, and blocks (blocks are what we called the 3x3 submatrices in a Sudoku board).
Our idea was that a row, column, or block is in a valid state
if it contains no duplicates of the digits 1 through 9. Zeroes (open cells) are acceptable in
a valid board. Obviously, they are not acceptable in a solved board.
To get there, we realized we needed to make initialize take the initial game board
as an argument (so you'll need to change that in the TestSudokuSolver#setup method and
SudokuSolver#solve, if you created it), and then we added the following tests (iteratively!):
def test_valid_row
assert @solver.valid_row?(0)
end
def test_valid_allrows
(0..8).each do |row|
assert @solver.valid_row?(row)
end
end
The implementation wasn't difficult, of course. We just need to reject all zeroes from the row, then run
uniq! on the resulting array. Since uniq! returns nil if each
element in the array is unique, and nil evaluates to false, we have:
class SudokuSolver
def initialize(board)
@board = board
end
def valid_row?(row_num)
@board[row_num].reject{|x| x==0}.uniq! == nil
end
end
At this point, we moved on to the columns. The test is essentially the same as for rows:
def test_valid_column
assert @solver.valid_column?(0)
end
def test_valid_allcolumns
(0..8).each do |column|
assert @solver.valid_column?(column)
end
end
The test failed, so we had to make it pass. Basically, this method is also the same for columns as it was
for rows. We just need to call Array#transpose# on the board, and follow along. The
valid_column? one-liner was @board.transpose[col_num].reject{|x| x==0}.uniq! == nil.
We added that first, made sure the test passed, and then refactored SudokuSolver
to remove the duplication:
class SudokuSolver
def initialize(board)
@board = board
end
def valid_row?(row_num)
valid?(@board[row_num])
end
def valid_column?(col_num)
valid?(@board.transpose[col_num])
end
def valid?(array)
array.reject{|x| x==0}.uniq! == nil
end
end
All the tests were green, so we moved on to testing blocks. We first tried slicing in two dimensions, but
that didn't work: @board[0..2][0..2]. We were also surprised Ruby didn't have something
like an Array#extract_submatrix method, which assumes it it passed an array of arrays (hey,
it had a transpose method). Instead, we created our own. I came up with some nasty, convoluted code,
which I thought was rather neat until I rewrote it today.
Ordinarily, I'd love to show it as a fine example of how much prettier it could have become. However, due
to a temporary lapse into idiocy on my part: we were editing in 2 editors, and I accidentally saved the
earlier version after the working version instead of telling it not to save. Because of that,
I'm having to re-implement this.
Now that that sorry excuse is out of the way, here is our test for, and implementation of extract_submatrix:
def test_extract_submatrix
assert_equal [[1,2,3],[4,5,6],[7,8,9]].extract_submatrix(1..2,1..2) , [[5,6],[8,9]]
end
class Array
def extract_submatrix(row_range, col_range)
self[row_range].transpose[col_range].transpose
end
end
Unfortunately, that's where we ran out of time, so we didn't even get to the interesting problems Sudoku
could present. On the other hand, we did agree to meet again two weeks from that meeting (instead of a month),
so we'll continue to explore again on that day.
Thoughts? (Especially about idiomatic Ruby for Array#extract_submatrix)
|