Thursday, March 8, 2012

Why I'm Moving Away from the Play Framework

I've been using the Play framework since I started at RtR 3 months ago. Last week, I made the decision that no new services will be written in Play from that point forward. It started out as a great little framework that was pretty quick to learn and easy to use, but it's turned into something that I would not recommend anyone use for serious production applications. What happened?


First, I lost faith in the developers.

One of the first things that annoyed me about Play was the inability to run a single test from within a play test class inside your IDE. I suppose the thinking was that you will always run the play test app, or something, but I prefer to leave my IDE as little as possible when I'm working, and running an entire test class worked fine. So, being the good little open source programmer that I am, instead of bitching I rolled up my sleeves and fixed the bug. It was a pretty trivial fix. I even wrote a test case. Then I put in a pull request and waited.
After submitting the pull request, I commented on the pull request, commented on the ticket, and finally sent an email to the mailing list. And the response I got was basically that the team is too busy working on the next generation of the product to absorb fixes for the older generation. Having worked on open source projects myself, I understand what it's like to have limited bandwidth to look at changes. But if the project team's bandwidth is so limited that they can't even afford to look at small fixes like this, it seems like the project is basically abandoned. At that point I lost faith that I could rely on the community to support the 1.X branch of this product. Not necessarily a dealbreaker, but definitely a bad sign. 

Then, I lost faith in the platform.

We started to hit some serious bugs in the platform during a big push on a complex service. First, our developers who used Mac and Windows hit a bug similar to this, where they simply couldn't get the app to work no matter what they did. It worked fine on Linux, but even a clean checkout would fail to run for them. It was inexplicable, irritating, and we lost a couple of days of development work trying to get around it (rolling back checkins, pulling out modules, poring through stack traces). By this point, I had lost faith in the community, so I didn't see the point in going to them for help. Fortunately, we did finally get around it (it seemed to be a bug in the CRUD module), but we were all really frustrated and annoyed with the framework after that experience.


Finally, I lost faith in my own ability to debug the framework.

The issues above were enough for me to want to move off of Play for new projects. The thing that caused me to move off of Play for projects that are already in development (but not in production) was this: At some point, we had written a migration job in Play for a major data migration. We discovered the strangest thing would happen. The job would run across several job threads, and at some point, one of the threads would hang. But it would not hang in a way that I have ever seen a JVM thread hang. The thread was in RUNNABLE state, and it was in a HashMap method (either get or put) and it was just sitting there. Not doing anything. No locks, no database or other IO, plenty of memory, plenty of resources, just sitting in that HashMap.get method, hanging out.
Now, maybe you've seen that before (and if you have, please leave a comment!). But I have seen a lot of JVM issues in my day, and this is a new one. There was no reason for this thread to be hung. And yet it was. I can debug just about anything you can throw at me in Java, but I had absolutely nowhere to start looking to debug this issue, except a vague suspicion that it was related to the way the framework was rewriting the classes under the covers. That is a dealbreaker, ladies and gents. I could've probably debugged why the module was causing the app to crap out for my developers, if given enough time. But I cannot say with any certainty that I could debug whatever the hell was causing that thread to hang.

If I felt the developers supporting play were committed to building a real community of support around the 1.X version, I might have stayed with it longer. It's a giant pain to find something else that is easy, lightweight, supports JPA and doesn't force me to write XML. But I can't use a product that I know has issues even I can't debug, and a team I don't trust to maintain the product to the standards my team needs to confidently use it in production.

47 comments:

  1. When you see the stuck JVM thread, is one of your cores pegged at 100%? The typical bug I've seen that causes a thread to get stuck in a hash map method in a Runnable state is a concurrent modification of the hash map. This can cause the get/put calls to get stuck in an infinite loop.

    A quick google turned up this stack overflow thread which confirmed some of my (admittedly hazy) memory on this stuff.

    http://stackoverflow.com/questions/1003026/hashmap-concurrency-issue

    Hope that helps.

    Sujal
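A minimal sketch of the usual remedy for the problem described above, assuming Java 8 or later: when several job threads write to one map, use `ConcurrentHashMap` rather than a plain `HashMap`. The class name `SharedMapSketch`, the `migrate` method, and its parameters are hypothetical stand-ins, not code from the original migration job or from Play.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedMapSketch {

    // Hypothetical stand-in for a migration job: several worker threads
    // record how many rows they handled per bucket in one shared map.
    static Map<Integer, Integer> migrate(int threads, int rowsPerThread, int buckets) {
        // ConcurrentHashMap is designed for concurrent writers; HashMap is not.
        Map<Integer, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int row = 0; row < rowsPerThread; row++) {
                    // merge() is an atomic read-modify-write on ConcurrentHashMap.
                    counts.merge(row % buckets, 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> counts = migrate(4, 10_000, 8);
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        System.out.println("rows counted: " + total);
    }
}
```

This only helps when the shared map is in your own code. If the map lives inside a framework, as the stack trace in the post suggests, the fix has to happen there.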

  2. Fair point, Sujal. I don't think the CPU was pegged in this manner, but it could have been due to this issue. We aren't running the job any more, so it's kind of a moot point, but I will keep this in mind for the future.

  3. Hey Camille,
    Too bad you've been encountering these issues. I've been keeping PlayN on my radar, but it sounds like there are still some fundamentals to sort out.

    Best of luck on the next pick...that JPA / XML comment you made was a good summary.

    Replies
    1. I'm not sure if you meant PlayN or Play, as those are two different frameworks. PlayN is Google's new framework for cross-platform game development. Play is a webapp framework. Very different beasts and completely unconnected to each other.

  4. Just a quick plug for the Stripes Framework. http://www.stripesframework.org/display/stripes/Home http://pragprog.com/book/fdstr/stripes

    The most enjoyable experience doing Java web development I've found so far. The second link is also a fantastic book.

  5. Hi there,

    In our defense, like any open source framework, we have other constraints. We are not backed by a huge company and we are trying to provide as much as we can. We were caught by surprise by the huge popularity of the framework, but be assured that things will change soon. Sorry about your experience with Play; we will make sure to fix any problems in due time.

    Nicolas

  6. So, was it a bug in Play - improper usage of unprotected classes?

    BTW, you've chosen whilefalse to be protected against whiletrue infinite loops? :)

    Replies
    1. Unclear where the bug was. It wasn't really worth it to spend a huge amount of time trying to debug.

      I liked the nerdity of elided branches... thoughts that I otherwise would not have fleshed out.

  7. I strongly advise taking a look at Grails. With Grails 2.0 it has totally become a great, advanced development platform.

    Replies
    1. Yup, I'm using Grails and developing plugins, it's loads of fun.
      But I guess Play! is cool if you need rather simple stateless apps to be deployed quickly.

    2. We are moving away from Grails to Node.js due to the "too much magic" behaviour. There are quite a few abstraction layers and they will leak. You can get funny stack traces with GORM, Hibernate, Spring and your own code happily interacting. We are really satisfied with Node.js because it's really possible to have thin, understandable abstractions, and the code of third-party modules is not the abstraction orgy that Spring and Java generally seem to encourage.

  8. welcome to java bro, get used to it

  9. Have you decided which framework you will use for future projects?

    Replies
    1. I'm leaning towards dropwizard (http://dropwizard.codahale.com/). We really don't need any of the view layer stuff that the bigger frameworks provide, and I like the idea of building on its simplicity.

  10. I urge you and everyone to try node.js

  11. I second (or third?) the suggestion to take a serious look at Grails. Any Java developer can pick up enough Groovy to be productive in a day, and you can always continue writing all of your logic in Java, if you are so inclined - the Java/Groovy integration is seamless.
    I used Play 1.2.x for the past six months. I also ran into the weird controller issue and spent a day rearranging code to get Play not to throw up. What really caused me to give up on Play is that I lost faith in the developers when they decided to embark on a big rewrite for 2.x instead of fixing the issues in 1.2.x. All geeks like to play with the latest toys (*cough* Scala *cough*), but I'd rather not put my faith in a team that goes after the new shiny instead of finishing what they started. Everything in 2.x has changed, and there is *no* migration plan for current users. The answer of the Play developers, that 1.2.x apps don't need to be migrated, just boggles the mind. Who says they won't decide to rewrite the framework again for 3.x in Clojure, or whatever the new shiny is at that time?
    I doubt that SpringSource would be as cavalier in their treatment of their users, so it's Grails for me.

  12. If you're in search for a new web framework to try I can really recommend JSF 2. Contrary to what you might have heard about JSF 1, JSF 2 is extremely simple yet very powerful.

    Above all it's got a very vibrant community with lots of external projects that provide extensions, components etc for it (E.g. PrettyFaces, PrimeFaces etc).

    The core implementations of JSF (Mojarra and Myfaces) have a public JIRA that's very open to suggestions and patches from users, so that's also definitely a plus.

  13. D'oh. Thanks @R Kyle Murphy for the clarification.

    I'm often doing a few web contracts, so I'm also voicing another vote for Grails.

    Anyways, again, best of luck to you and the team Camille!

  14. It sucks to lose a lot of time in a major investment like this, but I'm grateful you could share your experience with us all. Play gave me pause from the beginning, but I never put an appropriate amount of energy into trying it out. Now I'm glad I didn't.

  15. I have a similar problem with Play! Framework and I think the project has done this hit-and-run approach a couple of times already. The 1.0-1.1 upgrade was not as straightforward as you would expect. The 1.1-1.2 upgrade for us was a huge PITA. Now the 1.2-2.0 upgrade I think we won't even make. The 1.2 is getting no more features and only bug fixes and it was released less than a year ago. Bug reports and pull requests take a long time to get attention because the framework devs are busy with yet another version.

    Let's continue with this approach for a second and assume all future update paths will be just as tough as they have been so far. 2.0 comes out with the major overhaul (Scala, Akka, SBT etc.) and then less than a year later, BOOM, 2.1, no good upgrade path, 2.0 only for bug fixes and everybody is encouraged to use the latest and greatest. Java software projects tend to last longer than this.

    Of course for the framework authors it is great to be working with cutting edge technologies, try new things here and there, put maintenance into bugfix mode and have the bare minimum time to go over patches. From the user's point of view of course it sucks, your framework version not getting attention in less than a year after it has been released. If you upgraded later than that then the window is even smaller. Scary shit.

  16. A thread can loop forever in a HashMap when the HashMap gets corrupted by concurrent write access. The data structure becomes altered in such a way that the loop never ends.

    Replies
    1. Yeah, as I mentioned above that could've been it. The call trace wasn't in our code, though, so I'm not really sure that we would've been able to get around the improper synchronization issue without jumping through major hoops.

  17. Why not create a fork of play (1.2) together with some engaged users and integrate patches / pull requests?

    Cheers,
    Martin

    Replies
    1. I'm also wondering why the Play developers don't invite engaged users to help maintain Play 1.x...

      Cheers,
      Martin

    2. I considered it, but I already have one open source effort I work on and I don't really have the time to do another one. I do wish the Play devs had gone more of a community route on 1.x; it seems like it would've been more of a win-win, but who knows.

  18. This might look like a shameless plug, but considering the discussion here, it is appropriate.
    A while ago I developed a Java framework inspired by Ruby on Rails called ActiveWeb: http://code.google.com/p/activeweb/
    as well as an Active Record implementation in Java, ActiveJDBC: http://code.google.com/p/activejdbc/
    I have been enjoying these for a couple of years, using them on all my projects, commercial and private.
    Development was being done in parallel with Play in 2009, with Play hitting the market a few months earlier.

    While there are similarities between ActiveWeb and Play, there are notable architectural differences as well, see my response to 'opensas' on my blog: http://igorpolevoy.blogspot.com/2011/07/stop-hating-java-2.html

    Anyhow, those interested in no XML, conventions-based development and Rails-like productivity (including runtime compilation) may look into ActiveWeb as an alternative to Play and Grails.

    thanks,
    Igor

  19. I'm sorry that you are leaving Play. I think you are making a mistake.

    I know it is late, but I have now merged your pull request (https://github.com/playframework/play/pull/392) into both 1.2.x-branch and master-branch.

    As a Play committer I have to say that we are NOT leaving play 1.x to die..

    When I fix bugs or merge pull requests I still do it in two branches, both 1.2.x for the upcoming 1.2.5 release and in master, for the (possible) 1.3 release. New features are only (read: mostly) put in master branch.

    The problem is, as Nicolas Leroux pointed out, that many of us are doing this for free. And we have bills to pay.. So we have to have a daytime job also.

    A tip to others with pending (good) pull requests/fixes is to just build your own version of Play from your own fork, since you know the (good) fix will surely be included in a future version of the framework.

    - Morten Kjetland

  20. Comment on the "Lost my faith"

    I know this was posted a while ago but I thought I would offer my two cents.

    From the issue you describe, essentially a deadlock with all your threads in the runnable state, it sounds like a thread starvation issue. Play is built on/with Akka, which is a fantastic library for concurrency. Akka makes concurrent programming easier by providing things like Actors, Agents and composable Futures. All these things use underlying thread pools. If one is not careful it is possible to create thread starvation deadlocks in the following manner.

    Let's say there is a pool with only two threads and you have two async services, A and B. Service A depends on a result of service B to complete. If service A uses a blocking get on the result of service B, you can get a deadlock if the services execute on the same thread pool and the following scenario occurs.

    Client 1 makes a request to service A
    Client 2 makes a request to service A

    service A ( for client 1 ) makes a call to service B and calls a blocking get on the future result which ties up the thread.
    service A ( for client 2 ) does the same thing

    both threads are blocked but will still show up as RUNNABLE in a thread dump and service B has no thread to respond on.

    I know about this because I ran into the same issue and it took a while to diagnose. If you adhere to the following best practices these deadlocks are pretty easy to avoid.

    1. Always compose your futures. Akka futures, unlike Java futures, are composable, meaning that if you need the result of one to produce the result of another, use the onComplete callback or map (if using Scala).

    2. If two services interact and for some reason you have to use a blocking get, make sure they have their own thread pools, which can be configured in Akka by assigning Dispatchers.
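As a rough illustration of the first practice, here is a sketch that swaps in the JDK's `java.util.concurrent.CompletableFuture` (Java 8 or later) in place of Akka futures. The names `ComposeFuturesSketch`, `serviceA`, `serviceB`, and `handleTwoClients` are hypothetical. Both services share a pool of two threads, as in the scenario above, but service A chains onto service B's result with `thenCompose` instead of calling a blocking `get` from inside the pool, so no pool thread is ever parked waiting for another.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ComposeFuturesSketch {

    // Hypothetical service B: computes a value asynchronously on the shared pool.
    static CompletableFuture<Integer> serviceB(int input, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> input * 2, pool);
    }

    // Hypothetical service A: depends on B, but composes instead of blocking.
    static CompletableFuture<String> serviceA(int input, ExecutorService pool) {
        return CompletableFuture
                .supplyAsync(() -> input + 1, pool)
                .thenCompose(n -> serviceB(n, pool))
                .thenApply(result -> "client result: " + result);
    }

    // Two clients call service A at the same time on a two-thread pool.
    static String[] handleTwoClients() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<String> client1 = serviceA(1, pool);
            CompletableFuture<String> client2 = serviceA(2, pool);
            // Block only at the outer edge, on the caller's thread,
            // never on a thread that belongs to the shared pool.
            return new String[] {
                client1.get(10, TimeUnit.SECONDS),
                client2.get(10, TimeUnit.SECONDS)
            };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (String line : handleTwoClients()) {
            System.out.println(line);
        }
    }
}
```

The second practice would correspond to passing a different `ExecutorService` to `serviceB` than the one `serviceA` runs on, so that a blocking call in A can never consume the threads B needs.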

    As someone who has done a lot of programming with Akka and Grails, I would have to whole-heartedly disagree with the other commenters' Grails recommendation. I find Grails is a nightmare in maintainability and scalability. GORM has to be the worst way to maintain data integrity and predictable scaling ever. It works really well if you need to create a simple blog application for a demo or something, but for large high volume applications, no thanks. Also, if you have trouble with concurrency with Play/Akka, your trouble is only magnified with Groovy and GORM.

    2 cents. all the best,
    Andy

  21. We had a very good experience with Play. All our products are now in production. So, Play 1.2.5 is ready for production, and there is no problem running tests from the IDE - in IDEA it works just fine at least. And I'm happy to work with this framework.

    I remember some issue with applying evolution scripts (we fixed it for our own branch, but it was not a bug, we just expected different behaviour for our own case). It took us 2 hours and no problem.

    I recommend Play! It is a light in the Java world of web frameworks!

  22. Congratulations. You made the right choice.

    Unfortunately we chose this framework; I was not involved in the decision and nobody checked the history of the framework.

    Just to make you laugh: the last version of Play was 2.0.4, 2 months ago.

    They released version 2.1. You might expect some upward compatibility, unless some big issue in 2.0.4 had to be fixed and the fix broke compatibility.

    Nope.

    One has to change substantial parts of the code, so that it is just not possible to test the advantages and issues of the new version on the same code base as 2.0.4.

    And why? Just because they didn't like some class names, and several other cosmetic reasons.

    And the worst is, they find this perfectly normal.

    My 2 cents' guess is that Play 2 developers (us!) will have to throw away everything and restart from scratch when 3.0 comes if they want to keep using a maintained framework. Or get the complete source and do the framework's maintenance themselves. Good luck.

    And this is only one-among-many reasons I DO NOT recommend Play.

    Replies
    1. Why should you throw away everything? I am still using Play 1.2.4, and it works just fine. And if you don't expect things to change, you're wrong. Technology is always changing and so is the web. If you want to stick with something that doesn't change, you should've studied politics or something. You're in the wrong field, bud.

  23. It's good to know that Play has some of the same issues as Grails 2. Grails 1 (1.3.7) was quite fun to develop with and even though the developers didn't put a lot of attention on "enterprise" capabilities (such as robustness or small war files or ability to run behind load balancers or keeping stable interfaces in a framework or, or, or, etc...) I still had great hope that Grails would be an excellent framework as it grew up. Alas, 2 came out with a lot of bad ideas (that re-write of the unit testing made no sense and only cost companies a lot of money for nothing). In addition, the Grails folks got very picky about some minor issues with Groovy Mixins so they broke the whole thing by creating their own very broken version of a Mixin and then making it so that their testing code worked differently if you were using Mixins. In the end, you had to convert to the Grails (broken) Mixin and then do a ton of work to get it to work in all environments (unit, integration, war file, etc...). Overall, Grails convinced us to start looking at Dropwizard and I was fascinated that other frameworks are pushing people towards Dropwizard. I just hope the Dropwizard community learns from the mistakes of others. It's off to an excellent start though.

  24. What's good about 2.2.2? Typesafe Activator? What about if we have an existing DAO already, can we use Play just for the views and its controller?

  25. My post is a little old but still relevant I think. I think the Typesafe guys have stability and backward compatibility in their roadmap, but for the time being it is certainly abrasive. Perhaps someone could step-up and inspire a little more confidence?

    I still recommend Scala Play over Node.js due to their choice of platform languages:

    Scala (Pros):
    - Type-safe, leverage compile-time sanity checks to reduce test overhead and turn-around on feedback
    - Sealed traits and case classes, love these
    - Both functional and object-oriented
    - Easy to read (i.e. Cog[A >: B] and Widget[-A] instantly tell me everything I need to know)
    - Built-in functional features such as mapping, folding and options
    - Operator overloading increasing readability (e.g. { val future = actor ? "Hello" } >> function (){ var future = actor.sendMessage("Hello") }, I am particularly tired of an excess of braces, commas, and camels)
    - Brevity of expressing anonymous functions
    - Wildcards (i.e. _ and _*)
    - Encapsulation

    JavaScript (Cons)
    - Dynamic typing and duck typing in lieu of static typing
    - Fundamentally non-modular
    - No parametric polymorphism
    - No operator overloading
    - Lack of control over object lifecycle
    - Almost no encapsulation
    - Hard to read ('function' keyword, braces and commas, callback hell)
    - No IDE can seem to get context-aware code completion right.
    - Outdated basis: "JS had to "look like Java" only less so, be Java's dumb kid brother or boy-hostage sidekick. Plus, I had to be done in ten days or something worse than JS would have happened."

    In general I can instantly understand the structural relationships in a Scala project just by reading the source whereas due to JavaScript's nature as a late-bound dynamic language it often requires that I run JavaScript code and step through it manually to understand the structural relationships.

  26. Hello there!

    I ran into the HashMap bug some time ago. But reading your post, my feeling is that you were expecting your program to run serially and got annoyed because i) it was running concurrently, and ii) it was hung in such a harmless piece of code. If you want to understand what really happened in detail, you must read http://mailinator.blogspot.com.br/2009/06/beautiful-race-condition.html

    Besides that, I want to know if you still have this opinion about the Play Framework. I came across it almost 2 months ago while I was coding an Akka-based project. So, do you still have this opinion?
