Posted by Sam on Mar 13, 2013 at 08:45 AM UTC - 5 hrs
I recently made a proof-of-concept to get MS Word to edit documents in a Rails app and save them back to the server, using Devise for authentication. It wasn't straightforward, so I thought I'd mention the steps I took to get it done.
I used Rails 3.2.12 with rack_dav providing the WebDAV functionality.
There's also dav4rack which was based on rack_dav, and adds a few features.
I started with rack_dav, and had all of the kinks worked out except for the authentication, when I noticed dav4rack included authentication. So I switched to dav4rack, but I had some trouble with it, so I switched back to rack_dav because I feared I'd get into another day-or-two-long yak shaving session.
You might check it out though, and see if it works just as well.
Anyway, here's the basis of what I did.
1. Add the gem to my Gemfile (and run bundle install):
2. Mount it in my routes.rb, wrapped with basic authentication:
webdav = RackDAV::Handler.new(:root => 'private/docs/', :resource_class => LockableFileResource)
webdav_with_auth = Rack::Auth::Basic.new(webdav) do |username, password|
  user = User.find_for_authentication(:email => username)
  user && user.valid_password?(password)
end
mount webdav_with_auth, at: "/webdav_docs"
3. Create my lockable file resource, which doesn't really do any locking (since this was just a PoC), so you probably want to fix that:
4. Tell the browser to instruct Word to open the file: For users on the web, we don't want the default rendering of the WebDAV root, because we'd like these docs to be editable in Word, which means you have to tell Word to open it. While you could add this to your own resource collection, I just created an action to list the links to the files with the code to open it in Word. Here's the basis of what you need in the view:
Users who have Office 2010 installed on their system will also have those 2 plugins installed on their system, so we try to use them and return false (preventing the browser from following the link) if they have it. For those who don't, the JS provides a way to just download the file. See OpenDocuments Control API and FFWinPlugin Plug-in API for more info about what you can do with those plugins.
5. Tell Rails not to parse params for our mounted rack_dav app. For whatever reason, Word will try to save the file back to the server, PUTting it with the mime type application/xml, but it doesn't wrap it in XML. I don't suppose it's supposed to, but it would be nice if it played well with Rails. Unfortunately, Rails understandably barfs when you tell it to parse parameters from XML and don't send any XML. So, I had to create a patch and stick it in an initializer. All we do is figure out if the request is for where we mounted rack_dav: if so, let rack_dav handle it without parsing the params. If not, just parse the params as normal:
old_call = instance_method(:call)
define_method(:call) do |env|
if env["PATH_INFO"] =~ /\/webdav_docs\//
There may be some security problems with this if rack_dav parses request parameters like Rails did before they put out their security updates in January and February 2013. I plan to review the rack_dav code to make sure it doesn't expose the same problems Rails had before I put this in production, and I'd recommend you do the same before you do.
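To illustrate the wrapping technique from step 5 in isolation, here is a minimal, self-contained sketch. The `ParamsParser` class below is a hypothetical stand-in for whatever params-parsing middleware is being patched (the original target class is not shown above), and the anchored path pattern is my own tightening of the check, so treat this as an assumption-laden sketch rather than the exact initializer:

```ruby
# Hypothetical stand-in for a Rack middleware that parses request params.
class ParamsParser
  def initialize(app)
    @app = app
  end

  def call(env)
    # Pretend "parsing" just marks the env before passing it on.
    @app.call(env.merge("parsed" => true))
  end
end

# Reopen the class and wrap #call, keeping a handle on the original method.
class ParamsParser
  old_call = instance_method(:call)

  define_method(:call) do |env|
    if env["PATH_INFO"] =~ %r{\A/webdav_docs(/|\z)}
      @app.call(env)                 # skip parsing, hand off to the next app
    else
      old_call.bind(self).call(env)  # fall back to the original behaviour
    end
  end
end
```

Because `old_call` is captured by the block, the original method stays reachable even though `call` has been redefined.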
6. Tell Devise not to authenticate OPTIONS or PROPFIND requests. This is needed if you redirect your root url to a protected controller/action, because when Word makes requests using these methods, you'll get redirected and asked for authentication on a web page, which totally breaks the whole concept. This is probably in your application_controller.rb, unless you moved it. So if you moved it, you'll need to make the change there.
class ApplicationController < ActionController::Base
  before_filter :authenticate_user!, :unless => :options_request?
  def options_request?
    request.method == "OPTIONS" || request.method == "PROPFIND"
  end
end
That's all. Feel free to ask any questions if something is unclear or not working for you!
Posted by Sam on Mar 03, 2013 at 10:14 AM UTC - 5 hrs
Someone from our last houston.rb meetup asked me about contributing to open source software, so I wrote an email that talks about the little things I do. I thought others might find it helpful, so I'm also posting the email here. It is below:
It occurred to me I totally ignored your question about open source, so apologies for that,
and here I go attempting to rectify it.
Any time I'm using an open source project I haven't used before
(or using a new-to-me feature of it), I am reading the docs about it as I
implement whatever it is I'm using it for. If I find something unclear, or
something unaddressed, I fork the project, improve the docs to mention it,
and send a pull request.
Examples: How to use init.d services with rvm,
apartment code sample
Any time I am using an open source project and have need of functionality that could
conceivably be provided by the project, if it's general enough I think it would be of use to
others, I'll implement it in that project instead of mine, and send a pull request. Depending
on the project, I might ask them if they'd be interested first.
Examples: Rails table migration generator,
Spree refactoring for reuse
Now, I don't tend to contribute a lot to any one project. If you're more interested in that,
I would guess you ought to do something similar, but focus it all on the one or two projects
that really excite you. Go through the issue tracker, see if you can reproduce and fix the bugs
people are reporting (or leave a comment telling them how to fix their problem if it's not really
a bug), etc.
For example, Steve Klabnik wrote up a how-to on contributing to Rails,
which talks a little more on the human side of things, as opposed to just
the project contribution guidelines.
I think his blog post can be generalized to other projects as well, so it's worth a read to get an idea of how to go about interacting with bigger open source projects when you want to contribute.
Lots of projects will mention their own contribution guidelines though, so make sure you read them and follow them!
That link is to the slides and notes on each slide mentioning a little about what I said.
Basically I went through a couple of different classes of performance issues you're likely to see on the server side of things, what tools you might use to make finding the issue easier, how you can interpret some of the data, and even a suggestion here or there about what to do to resolve common issues I've seen.
Let me know what you think! (Or, if you have any questions, feel free to ask!)
Posted by Sam on Jan 22, 2013 at 06:24 AM UTC - 5 hrs
A greedy algorithm is an algorithm that follows the problem solving heuristic of
making the locally optimal choice at each stage with the hope of finding a global
optimum. In many problems, a greedy strategy does not in general produce an optimal
solution, but nonetheless a greedy heuristic may yield locally optimal solutions
that approximate a global optimal solution in a reasonable time.
What does a greedy algorithm look like?
Images from Wikipedia illustrate the concept well. In the first one,
we begin our search of the space for a maximum at point A. If we're greedy, the next
highest point takes us along a path that maximizes at point (lower case) m. However, the
optimal solution is at point (upper case) M.
If you're greedily searching (again, for a maximum) in three-dimensional space, and start in
a poor position (like near the left in the image below), you'll never make it off the shorter hill
onto the taller one:
What does a greedy algorithm for yourself look like?
Perhaps we should first define what it is for which we're optimizing: To keep it
simple, I'm going to identify it as maximizing income.
You could also be seeking to minimize the distance (in time) between working for clients,
or to minimize the time you spend acquiring each customer -- maximization is not
an intrinsic property of optimization.
I purposefully exclude things like time and happiness (or optimizing for some
combination of all three). While I value those things and would like to think I've
tried to optimize for them, upon closer inspection I recently realized I've tended
to simply accept whatever side-effect optimizing for money has had on those values.
I think it's a fair assumption that plenty of other people are doing the same. If you're
not one of them: congrats! But before you dismiss the idea entirely, you ought to consider
that you might be, just to be sure.
So what does it look like? Again from Wikipedia, I like the idea of moving from node to node
on a graph:
At each point in time (a vertex on the graph) we are presented with a series of options whose positive
return is the value at that vertex. The options available to us are connected by the edges.
(You might also assign a cost to each edge, and change your greedy-optimization heuristic to be
return minus cost).
In the graphic I've used, only two options are available at each point, but in reality there
are normally a lot more options from which to choose, and maybe even times when you have only
If we're starting at 7 and being greedy in our search for a maximum, we miss out on the global maximum.
You might mention -- hey, we could backtrack and search the entire space. But you might not
have the resources available once you've gone down the path to 6. You might have reached your
maximum workload by that point, and by the time you free up some time, the opportunity for the 99
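To make the walk through the graph concrete, here is a minimal Ruby sketch of a greedy search next to an exhaustive one. The values 7, 6 and 99 come from the text; the rest of the tree (3, 12, 1, 5) and all the names are assumptions I filled in for illustration:

```ruby
# A node in a hypothetical decision tree: `value` is the payoff at that
# vertex, `children` are the options reachable from it.
Node = Struct.new(:value, :children)

def leaf(value)
  Node.new(value, [])
end

# Values 7, 6 and 99 appear in the text; 3, 12, 1 and 5 are assumed.
TREE = Node.new(7, [
  Node.new(3,  [leaf(1), leaf(99)]),
  Node.new(12, [leaf(5), leaf(6)])
])

# Greedy: at every vertex take the child with the largest immediate value.
def greedy_path(node)
  path = [node.value]
  until node.children.empty?
    node = node.children.max_by(&:value)
    path << node.value
  end
  path
end

# Exhaustive: try every root-to-leaf path, keep the one with the largest sum.
def best_path(node)
  return [node.value] if node.children.empty?
  best_child = node.children.map { |child| best_path(child) }.max_by(&:sum)
  [node.value] + best_child
end

greedy_path(TREE) # => [7, 12, 6]
best_path(TREE)   # => [7, 3, 99]
```

The greedy walk grabs the 12 because it looks better than the 3 right now, and that choice is exactly what puts the 99 out of reach.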
How do you know if you're running your business like a greedy search algorithm?
Here are some symptoms:
Taking the first job offer or client you get when you need work
If you're a freelancer, filling all your time with client work
Taking the first business model that seems to be working and trying to maximize it
Can you think of any others?
Prior to last year, I was executing a totally greedy algorithm with regard to my career.
But it's not just something I've noticed in myself: almost every job I've had
utilized greedy methods for making money.
As part of my greedy search (which was not just about maximizing money, but more towards minimizing
unemployment, even if it meant working for people who would end up stiffing me on paychecks),
I ended up burning through about four months of savings (I had six). Last year
though, I started making some changes toward a better approach. When I made my way back up
to a six month cushion and took time to reflect on some of the changes that got me there,
I realized the parallels to algorithms I'd learned about in college.
Improving your odds
How can you improve your chances of finding a global (or at least higher) maximum?
In the general problem, given enough time and resources, you could do an exhaustive search -- or in our
graph model, visit every node.
But I don't have the resources to exhaustively search the entire space to produce an
optimal solution, and I'm betting you don't either.
Instead of straight greedy hill-climbing, we could explore
My favorite here is called "shotgun hill climbing," where instead of following a path all the way
to its maximum, we'll randomly restart (but we don't forget where we came from on
those restarts, in case we don't end up in a better position).
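As a rough sketch of that idea (random-restart hill climbing), assuming a made-up one-dimensional landscape where each index is a position and each value is its payoff, it might look something like this in Ruby. The landscape, method names and restart count are all hypothetical:

```ruby
# Heights of a hypothetical 1-D landscape; index = position, value = payoff.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 12, 8].freeze

# Plain greedy hill climbing: keep stepping to the higher neighbour until
# neither neighbour is higher. Returns the index of the peak reached.
def hill_climb(landscape, start)
  position = start
  loop do
    neighbours = [position - 1, position + 1].select { |i| i.between?(0, landscape.size - 1) }
    best = neighbours.max_by { |i| landscape[i] }
    return position if best.nil? || landscape[best] <= landscape[position]
    position = best
  end
end

# Shotgun hill climbing: climb from where we are, then from a number of
# random positions, always remembering the best peak found so far.
def shotgun_hill_climb(landscape, start, restarts:, rng: Random.new)
  best = hill_climb(landscape, start)
  restarts.times do
    candidate = hill_climb(landscape, rng.rand(landscape.size))
    best = candidate if landscape[candidate] > landscape[best]
  end
  best
end
```

Starting at position 0, plain `hill_climb` stops on the small hill at index 2. The restarts can only ever match or improve on that, since the best peak so far is never thrown away.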
It's what I like about Amy Hoy's stacking bricks
metaphor. A freelancer can start taking time between gigs (or not remain fully engaged) and spend
some time developing products. Each successive product is a new brick, and over time
you can build a wall.
Savings and product revenue are like ammunition for your shotgun hill climbing business
algorithm. They allow you to explore different starting positions, as long as you're not
being greedy with minimizing the distance (in time) between clients or booking all your time
to work for others.
In practice, all those ideas you have floating around in your head (or ideas.txt file)
are the different options you have in paths to take (in addition to other jobs, clients,
ideas for "upselling", etcetera). You have to spend a significant piece of time on one
to find out if it gets you further up the hill or not -- but not so much so that you can't
get back to your last local maximum or randomly try another.
(Side note: I'm not advocating strictly random
here -- you can apply some heuristic to affect the "probability" of choosing a certain idea).
Last year I took my first steps to taking a shotgun approach. This year I plan to take it further.
Posted by Sam on Jul 09, 2008 at 12:00 AM UTC - 5 hrs
Your boss gave you three weeks to work on a project, along with his expectations about what should be done during that time.
You started the job a week before this assignment, and now is your chance to prove you're not incompetent.
You're a busy programmer, and you know it will only take a couple of days to finish anyway, so you put it on the back-burner for a couple of weeks.
Today is the day before you're supposed to present your work. You've spent the last three days dealing with technical problems related to the project. There's no time to ask anyone for help and expect a reply.
Tonight is going to be hell night.
And you still won't get it done.
What can you do to recover? Embrace failure. Here's how I recently constructed an (anonymized) email along those lines:
Take responsibility. Don't put the blame on things that are out of your control. It's a poor excuse, it sounds lame, and it affords you no respect. Instead, take responsibility, even if it's not totally your fault. If you can't think of an honest way to blame yourself, I'd go so far as to make something up.
I've been having some technical troubles getting the support application to work with the project.
To compound that problem, instead of starting immediately and spreading my work across several days, I combined all my work this week into the last three days, so when I ran into the technical problems, I had very little time to react.
After trying to make the support application run on various platforms, I finally asked a teammate about it and learned that I needed to use a specific computer, one to which I did not have access.
As such, I don't think I can meet your expectations about how much of the project should be done by tomorrow.
State how you expect to avoid the mistake in the future. Admitting your mistake is not good enough. You need to explain what you learned from the experience, and how that lesson will keep you from making a similar mistake in the future.
I just wanted to say that I take responsibility for this mistake and in the future, I will start sooner, which will give me an opportunity to receive the feedback I need when problems arise. I've learned that I cannot act as a one-man team and that by starting sooner I can utilize my teammates' expertise.
Explain your plan to rectify the situation. If you don't have a plan for fixing your mistake, you'll leave the affected people wondering when they can expect progress, or if they can trust you to make progress at all. Be specific with what you intend to do and when you will have it done, and any help you'll need.
I already sent an email request to technical support requesting access to the specific computer, and await a response.
In the meantime, here's how I expect to fix my mistake:
a) I need to run the support program on data I already have. It will analyze the data and return it in a format I can use in the next process. I can have this completed as soon as I have access to the machine, plus the time it takes to run.
b) I need to learn how to assemble another source of data from its parts. I have an article in-hand that explains the process and I am told we have another support program that will be available next week. I do have the option to write my own "quick and dirty" assembler, and I will look into that, but I do not yet know the scope.
c) I need to use another one of our tools on the two sets of data to be able to analyze them. Assuming I am mostly over the technical problems, I wouldn't expect this to cause any more significant delay.
d) Finally, I'm unsure of how to finish the last part of the project (which is not expected for this release). If possible, I'd like to get feedback on how to proceed at the next meeting with our group.
After that, close the email with a reiteration that it was your fault, you learned from it, you won't let it happen again, and that it will be resolved soon.
Since I rarely make mistakes, I'm certainly no expert at how to handle them. Therefore, I pose the question to you all, the experts:
How would you handle big mistakes? What strategies have worked (or failed) for you in the past?
Indeed, that code is hard to understand, and comments would clear it up. And I'm not trying to pick on Peter (the code is certainly not something I'd be unlikely to write), but there are other ways to clear up the intent, which the clues of str, pat, * and ? indicate may have something to do with regular expressions. (I'll ignore the question of reinventing the wheel for now.)
For example, even though pat, str, idx, ch, and arr are often programmer shorthand for pattern, string, index, character, and array respectively, I'd probably spell them out. In particular, str and arr are often used to indicate data types, and in this example the data type is of primary, not secondary, importance, so I'd opt to spell them out.
Another way to increase the clarity of the code is to wrap this code in an appropriately named function. It appears as if it was extracted from one as there is a return statement, so including a descriptive function name is not unreasonable, and would do wonders for understandability.
But the most important ways in which the code could be improved have to do with the magic strings and boolean expressions. We might ask several questions (and I did, in a follow-up comment to Peter's):
Why are we stopping when patArr[patIdxEnd] EQ '*' OR strIdxStart GT strIdxEnd?
Why are we returning false when ch=="?" and ch!=strArr[strIdxEnd]?
What is the significance of * and ?
In regular expression syntax, a subexpression followed by * tells the engine to find zero or more occurrences of the subexpression. So, we might put it in a variable named zeroOrMore, and set currentPatternToken = patArr[patIdxEnd]. We might also set outOfBounds = strIdxStart GT strIdxEnd, which would mean we continue looping when currentPatternToken NEQ zeroOrMore AND NOT outOfBounds.
Similarly, you could name '?' by putting it in a variable that explains its significance.
And finally, it would be helpful to further condense the continue/stop conditions into variable names descriptive of their purpose.
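Peter's snippet isn't reproduced here, so as a rough illustration only, here is how the loop condition might read with those names applied, translated into Ruby. The method name `keep_scanning?` and its parameter names are hypothetical; only `zeroOrMore`, `currentPatternToken` and `outOfBounds` come from the suggestions above:

```ruby
# Hypothetical helper: should the matching loop keep going?
def keep_scanning?(pattern_chars, pattern_end, string_start, string_end)
  zero_or_more          = "*"                         # the magic string, named
  current_pattern_token = pattern_chars[pattern_end]
  out_of_bounds         = string_start > string_end

  current_pattern_token != zero_or_more && !out_of_bounds
end

keep_scanning?(%w[a b c], 2, 0, 3) # => true
keep_scanning?(%w[a b *], 2, 0, 3) # => false, we hit the zero-or-more token
keep_scanning?(%w[a b c], 2, 4, 3) # => false, we ran off the end of the string
```

The condition now reads close to the sentence you'd use to explain it, which is the whole point of naming the pieces.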
In the end, regular expression engines may indeed be one of those few applications that are complex enough to warrant using comments to explain what's going on. But if I was already aware of what this piece of code's intent was, I could also have easily cleared it up using the code itself. Of course it is near impossible to do after the fact, but I think I've shown how it might be done if one had that knowledge before-hand.