My Secret Life as a Spaghetti Coder
Automatic download of libraries and packages
I'd like to see automatic downloads of libraries/packages from IDEs.

The idea came to me a couple of weeks ago on Twitter.

What if our programming environment automatically checked for libraries?

Suppose I decide I want to use ActiveRecord, but I don't have it installed on my system. Wouldn't it be nice to just type

class Entity < ActiveRecord::Base
...
end

and have IntelliJ IDEA (or Notepad++ or SciTE or Emacs or ...) just download it for me?

I need an MP3 library. Instead of seeing that there's no file to load, wouldn't it be great if the editor tried to find it?

No such file to load.

All that needs to happen is that the editor checks each include and reference against an index. If it doesn't find the library in the standard library or among the installed libraries, it goes to the index to find possible matches. If there's only one, it downloads it and continues. If more than one exists, it asks you which one you want to download and include.
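
The lookup described above could be sketched roughly as follows. This is only an illustration: `LIBRARY_INDEX`, `resolve_library`, and the package names in the table are hypothetical stand-ins, and a real tool would query a remote index and then actually perform the download.

```ruby
# Hypothetical index mapping a required name to candidate packages.
# A real tool would query a remote service; this hash stands in for it.
LIBRARY_INDEX = {
  "active_record" => ["activerecord"],
  "mp3info"       => ["ruby-mp3info", "mp3info"]
}.freeze

# Decide what to do about one required name.
# installed: names already available locally (standard library included).
# Returns [:ok], [:download, package], [:ask, candidates], or [:unknown].
def resolve_library(name, installed:, index: LIBRARY_INDEX)
  return [:ok] if installed.include?(name)

  candidates = index.fetch(name, [])
  case candidates.size
  when 0 then [:unknown]                     # nothing in the index either
  when 1 then [:download, candidates.first]  # unambiguous: fetch and continue
  else        [:ask, candidates]             # several matches: let the user pick
  end
end

installed = ["json"]
p resolve_library("json", installed: installed)           # prints [:ok]
p resolve_library("active_record", installed: installed)  # prints [:download, "activerecord"]
p resolve_library("mp3info", installed: installed)        # prints [:ask, ["ruby-mp3info", "mp3info"]]
```

Returning a decision instead of downloading directly keeps the editor in charge of prompting the user when there is more than one match.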

This isn't limited to Ruby. In fact, I'd love it more in Java and .NET. I can't count the number of times on those platforms that I've looked up how to do something, only to be stymied because the example never mentioned which package or namespace to use.

Automatic Parallel Programming
Around the same time, I also thought it would be nice to have compilers and interpreters decide when concurrency would be appropriate:

What if our runtimes automatically split into several processes when it was safe to do so?

This can get really tricky. In fact, we don't really want fully automagical detection. As far as I can tell, there are some cases where it could work, but it's not worth the apprehension we'd feel from not knowing when the compiler or interpreter was going to do it.

But there are plenty of cases where it is possible. I've been in several of them lately. Even setting those aside, instead of typing the boilerplate to make it happen, I really want something almost automagic:

parallelize {
  some_op_over_most_of_a_set
  merge_parts { |part| do_something_with(part); }
}


Not all operations would require a merge operation. But overall, wouldn't it be excellent?
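
One way such a block could look in today's Ruby is sketched below, using plain threads. The `parallelize` helper, its `workers:` and `merge:` parameters, and the default merge are all hypothetical, made up for illustration. It also assumes the block touches no shared state, and on MRI the global interpreter lock means threads mostly help with I/O-bound work rather than CPU-bound work.

```ruby
# Hypothetical helper: split items into slices, run the block on each slice
# in its own thread, then hand the partial results (in slice order) to merge.
def parallelize(items, workers: 4, merge: ->(parts) { parts.flatten(1) })
  return merge.call([]) if items.empty?

  slice_size = (items.size / workers.to_f).ceil
  threads = items.each_slice(slice_size).map do |slice|
    Thread.new { yield slice }   # the caller's block does the per-part work
  end
  merge.call(threads.map(&:value))  # Thread#value waits for each thread
end

# No custom merge needed: the partial arrays are simply concatenated.
squares = parallelize((1..10).to_a) { |part| part.map { |n| n * n } }
p squares

# A custom merge: add up the partial sums.
total = parallelize((1..100).to_a, merge: ->(parts) { parts.sum }) { |part| part.sum }
p total  # prints 5050
```

The merge step is optional here, which matches the point that not every operation needs one.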



The first is certainly possible. The second may require some research and work.

What do you think? Pipedream, or possible? So who's in?



Comments

I have to agree that the automatic download would be a superb feature. Of course, only if the downloaded packages can be trusted. For Java-based platforms, I could imagine some kind of built-in Maven tool.

As for the parallelization issue: the closest thing I know of is Microsoft's Parallel LINQ, which does exactly that. It operates on a set of items and merges the results together:

http://msdn.microsoft.com/en-us/magazine/cc163329....

However, the runtime of any language with functional features like list comprehensions could be smart enough to parallelize automatically. E.g., for Ruby, the .each method could be parallelized easily if the runtime knows that there are no write operations to variables outside the block scope.

nd

Posted by nd on Aug 13, 2008 at 01:02 AM UTC - 6 hrs

I know what you mean - this is the way of life on Linux, at least on systems that use "apt".

Still, the idea is carried a lot further there. This doesn't only work for language libraries, but the entire system, top to bottom.

Posted by Corey Furman on Aug 13, 2008 at 07:30 AM UTC - 6 hrs

I've had the same idea about auto-download libraries, I think. If so, "apt" and "maven" are totally inadequate comparisons.

The idea is literally blurring the line between writing in a language and extending the language. The language is a single entity out there, of which you merely have parts stored locally. The local storage is totally automatic and transparent. You should never need to know a jar file name, just the package names.

The main complication is version control. If someone updates a package, my code should not just break suddenly. I should be informed first. If I don't want the new version, I should be able to stick with the old version. I should also be able to try the new version, and roll back if I want to.

To minimize these kinds of problems, there should be 2-way visibility. If others are using my package, I should be able to get details about those dependencies, and which versions they're using. If they checked in unit tests, I should be able to run their unit tests easily so that I can have some assurance that my updates didn't break their code. And I should be able to deprecate versions of my packages so that they accumulate no new users and existing users are notified of the deprecation.

Another complication is licensing. It's easiest if we just assume that the whole thing is open-source freeware, but that may not be realistic in the long run. What safe, easy thing happens when you import a package from a commercial library?

Anyway, that's a rough sketch of the idea. Whoever implements it will have to start simple. There is nothing like it yet. Maven and Ruby gems fall woefully short of this vision. Apt is a different thing.

Great idea, Sam :-)

Posted by awh on Aug 13, 2008 at 09:25 AM UTC - 6 hrs

There are ways to parallelize ops, using a map/reduce paradigm. It's what Google uses. It's really, really rad.

Posted by Dave on Aug 13, 2008 at 09:40 AM UTC - 6 hrs

Yeah, you're right - don't know what I was thinking. I really shouldn't code and think at the same time - at least until I've had a cup of coffee :)

Posted by Corey Furman on Aug 13, 2008 at 10:06 AM UTC - 6 hrs

@nd - Regarding #each, that was my initial thought too. But how often do you see something like "break if this is the first line" because, for instance, it contains column names and you don't want to perform the same ops on it as you would on the rest of the data? Likewise, I think you need a merge condition in addition to the edge conditions.

@awh - sounds like you've explained my thoughts better than I did. =) Thanks for the comments. Do I see a project with your name on it coming up?

@Corey - well, as you said, "the idea is carried a lot further there." In essence, I think it can be described in one phrase as "automagic package management," but as awh mentioned, it can be carried further and the challenges are plenty.

@Dave - thanks for bringing up mapreduce. If anyone's interested, Google published a paper on it at http://labs.google.com/papers/mapreduce.html

Posted by Sammy Larbi on Aug 15, 2008 at 10:00 AM UTC - 6 hrs

There are a handful of solutions to your second problem.

The most interesting is probably OpenMP http://en.wikipedia.org/wiki/OpenMP

There are a handful of explicitly parallel languages out there too... Parallel Haskell? Unified Parallel C? But nobody uses them.

Posted by Joe Auricchio on Aug 19, 2008 at 12:39 PM UTC - 6 hrs
