Nathan's Tech Blog

Friday, June 02, 2006

EJB Stateful Session Beans – Where did the Garbage Collector go?

Summary: Asking the developer to call the "remove" method on a stateful session bean is as bad as the "delete" operator in C++. We’ve got garbage collection with POJOs. Why did we revert to manual instance management with EJBs?

For many years I've been avoiding Java Enterprise Edition. It seemed rather pointless to me. It was supposed to help me to write less code (so I wouldn't have to write remoting, transactions, and stuff like that), but every time I looked at it I came to the conclusion that it would cause me to write more code. Java EE 5 seems like it might actually help me to write less code! So now I'm finally looking seriously at EJB. Therefore, I'm sure the following complaint is nothing new.

The Java language brought a number of great programming features into the mainstream. In my opinion, one of the best features popularized by Java is automatic garbage collection. This frees the developer from maintaining object instance lifecycle and memory allocation/deallocation.

It would be reasonable to assume that container-managed beans would improve and enhance object instance lifecycle management to something beyond the garbage collector. Unfortunately, to some degree container-managed Enterprise JavaBeans take us a step backwards in the area of automatic object lifecycle maintenance. More of the responsibility of managing lifecycle is back in the hands of the developer. Even the latest EJB3 spec does not seem to correct this limitation of the EJB programming model.

Let's compare POJO instances and EJB3 Stateful Session Bean (SFSB) instances.

With a POJO, when you want an instance, you create it with the "new" operator. As long as something in your code maintains a reference to the instance, it will not be deleted and garbage collected. Once all references are removed, it is immediately available to be cleared out of memory by the garbage collector. Java supports a "finalize" callback which will be called (theoretically) before the garbage collection removes the object. (Yes, I know this is oversimplification.)

Now let's examine EJB3 Stateful Session Bean (SFSB) instances. When you want an instance, you make a JNDI lookup, which instantiates the bean in the container and returns a stub to you. As long as you continue to call methods on the stub (which are delegated to the bean), the bean will not be deleted and garbage collected. If you keep a reference to the stub, but fail to make any calls to the object for a set timeout period, the object will be removed. Once all references are removed, you must still wait for the timeout period before the instance is available to be cleared out of memory by the garbage collector. If you would like the instance to be eligible for garbage collection earlier, you must call the "remove" method. EJB3 supports the PreDestroy callback, which will be called before the bean is destroyed.
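Here is a toy sketch, in plain Java, of the lifecycle just described. It is not EJB code and it does not use any container API: ToyContainer, lookup, call, sweep, and the integer handle that stands in for the stub are all names I made up for illustration, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a stateful-bean container; NOT a real EJB API. */
class ToyContainer {
    private final long timeoutMillis;
    // bean id -> time of the last call made through its handle
    private final Map<Integer, Long> lastUsed = new HashMap<>();
    private int nextId = 0;

    ToyContainer(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    /** Stands in for the JNDI lookup: creates a bean and returns its handle. */
    int lookup(long now) {
        int id = nextId++;
        lastUsed.put(id, now);
        return id;
    }

    /** A business call through the handle; fails if the bean is already gone. */
    void call(int id, long now) {
        if (!lastUsed.containsKey(id)) {
            throw new IllegalStateException("bean " + id + " no longer exists");
        }
        lastUsed.put(id, now);
    }

    /** The explicit "remove": the only way to free a bean before the timeout. */
    void remove(int id) { lastUsed.remove(id); }

    /** Periodic sweep: drops beans that have been idle longer than the timeout. */
    void sweep(long now) {
        lastUsed.values().removeIf(last -> now - last > timeoutMillis);
    }

    int liveBeans() { return lastUsed.size(); }
}

class ToyContainerDemo {
    public static void main(String[] args) {
        ToyContainer c = new ToyContainer(1000);
        int kept = c.lookup(0);     // client keeps this handle but goes idle
        int dropped = c.lookup(0);  // client forgets this handle without calling remove
        int removed = c.lookup(0);  // client calls remove explicitly
        c.remove(removed);

        c.sweep(500);
        System.out.println(c.liveBeans()); // 2: the forgotten bean still occupies the container

        c.sweep(1500);
        System.out.println(c.liveBeans()); // 0: the bean whose handle is still held timed out too
    }
}
```

Both complaints show up in the demo: the forgotten bean lingers until the sweep catches it, and the handle that is still held goes dead anyway.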

So, with EJB3 Stateful Session Beans, we have moved away from a reachability-based, automatically managed object instance lifecycle. We have now reverted to a developer-managed instance lifecycle with a "session timeout" tacked onto the side. The "remove" method smells a lot like a "delete" operator, putting the responsibility on the developer to destroy the instance.

Even worse than requiring the developer to delete objects, it seems that at any moment a SFSB stub could become unusable because the container-managed SFSB was removed due to a period of inactivity. Of course, the container could be configured with a longer timeout, but then resource management will not be as efficient, since unusable beans (whose stubs have long since disappeared) may still be taking up memory in the container, passivated to disk (which burns CPU & I/O), and then finally destroyed long after they should have been destroyed.

So, how should the Stateful Session Bean lifecycle be managed? In my opinion, the lifecycle of a SFSB should follow the standard POJO lifecycle model. The actual container-managed bean instance that is behind the stub should have its lifecycle tied directly to the lifecycle of the stub. When no references to the stub remain, the SFSB should be automatically removed and become available for garbage collection. As long as a reference to the stub remains, the SFSB behind that stub should not time out (or at least I should be able to configure a really long timeout, taking comfort in the fact that the loss of the last "active" stub will cause the instance to be automatically removed before the timeout).
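For a single JVM, the idea can be roughed out with the standard java.lang.ref classes. This is only a sketch under heavy assumptions: Stub, StubTrackingContainer, create, and reap are hypothetical names, the "bean state" is just a StringBuilder, and a real remote stub living in another JVM would need some form of distributed garbage collection that this does not attempt.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stub handed to the client; it only carries the bean's id. */
class Stub {
    final int beanId;
    Stub(int beanId) { this.beanId = beanId; }
}

/** Toy container that removes a bean once its stub is no longer reachable. */
class StubTrackingContainer {
    private final ReferenceQueue<Stub> deadStubs = new ReferenceQueue<>();
    // Weak reference to each stub -> id of the bean behind it.
    // WeakReference uses identity equality, so it is safe as a map key.
    private final Map<WeakReference<Stub>, Integer> tracked = new HashMap<>();
    private final Map<Integer, StringBuilder> beans = new HashMap<>();
    private int nextId = 0;

    Stub create() {
        int id = nextId++;
        beans.put(id, new StringBuilder());
        Stub stub = new Stub(id);
        tracked.put(new WeakReference<>(stub, deadStubs), id);
        return stub;
    }

    /** Removes every bean whose stub the garbage collector has already enqueued. */
    int reap() {
        int removed = 0;
        Reference<? extends Stub> ref;
        while ((ref = deadStubs.poll()) != null) {
            Integer id = tracked.remove(ref);
            if (id != null && beans.remove(id) != null) {
                removed++;
            }
        }
        return removed;
    }

    int liveBeans() { return beans.size(); }
}

class StubTrackingDemo {
    public static void main(String[] args) throws InterruptedException {
        StubTrackingContainer c = new StubTrackingContainer();
        Stub kept = c.create();
        c.create();         // this stub is discarded immediately and never "removed"
        System.gc();        // only a hint; collection timing is not guaranteed
        Thread.sleep(100);
        c.reap();
        System.out.println("live beans: " + c.liveBeans()
                + ", kept bean id: " + kept.beanId);
    }
}
```

The bean behind the kept stub can never be reaped while the stub is strongly reachable. When the discarded one goes away depends on the collector, which is exactly the trade-off a POJO already lives with.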

Do any containers actually implement the SFSB lifecycle in this way so as to avoid the "remove" method?

Tuesday, January 24, 2006

This is the year Ruby will overtake... ABAP?

So often I hear about "Ruby the Java killer."

I just saw the "TIOBE Index", which rates the popularity of programming languages by measuring hits on various search engines. Not too scientific, but it seems like a good "ballpark" indicator.

Java is in position #1, with over 22% popularity (and that's a 4% increase from last year).

I looked for Ruby, expecting it to be somewhere near the top. To my surprise (and disappointment), it was down at number 22, nipping at the heels of... ABAP? What in the world is ABAP? Turns out that it is the language of SAP's application server, and that it is similar to COBOL [yuck] (which happens to be #13 on the popularity list).

Don't get me wrong. I think Ruby is a great language. Its popularity is sure to increase in 2006, and for very good reason. I certainly hope that Java folks are paying attention to Ruby and Rails and that they incorporate some of the great concepts pioneered by Ruby developers into Java and its frameworks. I'm watching JRuby closely. I also hope to be able to write some Ruby apps soon. From playing with it so far, it looks like that will be fun.

But Java is still fun for me, and it is still holding its own. In a very strong way.

Keep learning, but don't forget what you know.

Thursday, January 05, 2006

Why Tapestry's Crazy Rewind Is a Good Thing

For the past year and a half I've been playing around with a web application framework called Tapestry. It's a component-based framework originally written by Howard Lewis Ship and now maintained by a community of developers and users. It has many similarities to the JavaServer Faces framework specification.

However, it also has some important differences. One of the most complained-about and most misunderstood aspects of Tapestry is known as the "Rewind Cycle". Recently there has been talk of removing the "Rewind Cycle" altogether. This would be an unfortunate loss.

The "Rewind Cycle" is really more of a "Page Render Replay". When Tapestry processes an incoming form, it needs to "replay" the submitting page's rendering step in order to find all of the components and their appropriate bindings. This is actually very similar to the JSF "Restore View" step, which is the first step in the JSF Standard Request Processing Life Cycle (http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JSFIntro10.html). JSF usually restores the view either from serialized data in a hidden form variable or from session data stored on the server.

Tapestry, on the other hand, restores the view by re-rendering part (or all) of the page. This is the "rewind cycle" (which, of course, would be better named "replay phase" or "restore phase"). This approach of replaying the page to restore the view has both negatives and positives.

First, the negatives.

The components that make up the page often depend on data stored in the database or in a session variable. It is possible that, between the time the page was originally rendered and when the form POST is processed, the data could have changed. In this case, replaying the page's rendering to restore the view will yield components that are different from when the page was originally rendered. This will cause a mismatch between the data being POSTed to the application and the components intended to accept that data. This leads to the infamous "Stale Session" error in Tapestry.
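To make the failure mode concrete, here is a small plain-Java sketch of "render, then replay on submit." It is not Tapestry code. RewindSketch, render, rewind, and the "item_N" field ids are invented for illustration, and I'm assuming the submitted form carries the ordered list of field ids that were produced by the original render.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RewindSketch {
    /** Render: walk the model and emit one field id per item, in order. */
    static List<String> render(List<String> items) {
        List<String> fieldIds = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            fieldIds.add("item_" + i);
        }
        return fieldIds;
    }

    /**
     * Rewind: replay the same walk against the current model and write each
     * posted value back. If the replay does not produce the ids that were
     * originally rendered, the submission is rejected as stale.
     */
    static void rewind(List<String> items, List<String> renderedIds,
                       Map<String, String> posted) {
        List<String> replayedIds = render(items);
        if (!replayedIds.equals(renderedIds)) {
            throw new IllegalStateException("stale form: rendered " + renderedIds
                    + " but replay produced " + replayedIds);
        }
        for (int i = 0; i < items.size(); i++) {
            items.set(i, posted.get(replayedIds.get(i)));
        }
    }

    public static void main(String[] args) {
        List<String> model = new ArrayList<>(Arrays.asList("red", "green"));
        List<String> ids = render(model);

        Map<String, String> posted = new HashMap<>();
        posted.put("item_0", "blue");
        posted.put("item_1", "yellow");

        rewind(model, ids, posted);
        System.out.println(model); // [blue, yellow]

        model.add("extra"); // the underlying data changes after the render
        try {
            rewind(model, ids, posted);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The first rewind succeeds because the replay walks the same two items. The second fails because the model grew in between, so the replayed ids no longer line up with what the browser submitted.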

This can sometimes be mitigated by using a blend of Tapestry's "page replay" strategy and JSF's "serialized data in a form field" strategy. Some Tapestry components (such as "For" and "If") store data in hidden form fields. Then during "rewind" (replay/restore), the component will build itself using that data, thus providing the exact same results during "rewind" as during the original render.

Second, because Tapestry applies values from the request into the model and calls listener methods DURING the "rewind cycle", the model's data may change in the middle of the re-render, and this change in the data can change the remainder of the page. This leads to a "Stale Session" error, since the components in the re-render will now no longer match the components of the original render. This issue can be difficult to mitigate.

(Note that JSF always applies request values and executes events AFTER the "restore view" phase, not during. Note also that Tapestry 4 mitigates half of this issue by "deferring" the firing of most listener methods until after the form's values have all been processed. In this respect, Tapestry provides more options (both immediate and delayed event processing), while JSF only provides delayed event processing.)

Well, what about the positives?

The biggest positive is a variant of one of the negatives above. It comes from the fact that Tapestry performs the "Restore View", "Apply Request Values", and "Process Events" phases all simultaneously. This, while being confusing, is very powerful.

The result is that Tapestry allows dynamic and arbitrarily complex bi-directional value data binding, including easy handling of loops. Phew... that's quite a mouthful!

Bi-directional value data binding is one of Tapestry's biggest strengths. I know very few frameworks (a.k.a. none) that do it quite as well as Tapestry does.

This is especially true when loops are involved. For example, if I want to loop through a list of values and bind each value to a TextField component, I can do that. As long as I don't let the length of the list of values change between the original render and the form POST, then Tapestry will automatically handle the bi-directional binding, even if I bind to the loop variable! That's because it is REPLAYING the loop while it is processing the POSTed values and binding the values back into my model.

Below is an example using NON-TAPESTRY pseudo code. Even though it is not Tapestry, you could easily create a one-to-one conversion of this into Tapestry.

If the "rewind cycle" is truly removed from future versions of Tapestry, some very creative solution will need to be used to retain this functionality. For me, this is what makes Tapestry stand out above the others in the crowd. I can't convert my app to JSF and Seam because only Tapestry supports this complexity of bi-directional binding, which is critical for the application that I'm building. This flexibility and power needs to be preserved in future versions of Tapestry. If the rewind cycle is replaced, it needs to be replaced by something better.

---------------------------------------
Model classes:

Description: This pseudocode implements a page similar to an online survey. The question definitions are loaded from a database. The definitions determine what type of question is asked, such as whether it is answered by a text box, a radio button group, or a set of check boxes. The application dynamically builds the form, which is automatically bound to the model representing the list of answers.

I'm using pseudocode instead of real code because I'm lazy and because my pseudocode is shorter than real code. It should be easy to understand.
---------------------------------------

// note: this is pseudo code!
class Question
{
String idCode;
String text;
AllowedValues allowedValues;
String defaultStringValue;
List<String> defaultStringListValue;
}
abstract class AllowedValues
{
String type;
}
class AllowedValuesSingleSelect extends AllowedValues
{
List<Option> options;
}
class AllowedValuesMultiSelect extends AllowedValues
{
List<Option> options;
}
class Option
{
String valueCode;
String text;
}
class AllowedValuesString extends AllowedValues
{
int maxLength;
}
class AllowedValuesInteger extends AllowedValues
{
int min;
int max;
}
class Answer
{
Question question;
String stringValue;
List<String> stringListValue;
}
class AnswerList extends HashMap<String,Answer>
{
AnswerList(List<Question> qs)
{
// for each question, create an answer object with default value
// and a reference to the question
}
}


---------------------------------------
The view:
---------------------------------------

<html>
<body>

<!-- note: this is pseudo code, not tapestry code and not HTML and not JSF -->
<formComponent>

<foreach question in listOfQuestions>

$question.text

<if question.allowedValues.type=="SINGLE_SELECT">
<radiogroupComponent bindVariable=answers[question.idCode].stringValue>
<foreach option in question.allowedValues.options>
<radiobuttonComponent value=option.valueCode> $option.text<br/>
</foreach>
</radiogroupComponent>
</if>

<if question.allowedValues.type=="MULTI_SELECT">
<multiselectComponent bindVariable=answers[question.idCode].stringListValue
valueOptions=question.allowedValues.options/>
</if>

<if question.allowedValues.type=="STRING">
<textBoxComponent bindVariable=answers[question.idCode].stringValue
maxLength=question.allowedValues.maxLength />

</if>

<if question.allowedValues.type=="INTEGER">
<textBoxComponent bindVariable=answers[question.idCode].stringValue
validation=integerValidator(min=question.allowedValues.min,max=question.allowedValues.max) />

</if>

</foreach>

<submitButton listener="submitForm"/>

</formComponent>
<!-- note: this was pseudo code, not tapestry code and not HTML and not JSF -->

</body>
</html>


---------------------------------------
The controller:
---------------------------------------

// note: this is pseudo code!
class PageWithQuestions extends BasePage
{
DatabaseObject database; // injected from the container

List<Question> listOfQuestions; // should start as null each time page is pulled from pool to be processed
AnswerList answers; // should start as null each time page is pulled from pool to be processed

void preparePageForRenderOrRewind()
{
if (listOfQuestions==null) listOfQuestions = database.loadQuestions();
if (answers==null) answers = new AnswerList(listOfQuestions);
}

public String submitForm()
{
database.saveAnswers(answers);
return "nextPage";
}
}

Thursday, October 06, 2005

Are Annotated Java Classes Still POJOs?

The Spring framework, as a lightweight container, prides itself on being able to "wire together" Plain Ordinary Java Objects (POJOs) through dependency injection. Most descriptions that I've read of the dependency injection model of the up-and-coming EJB3 specification claim the same. (For an example, see this ONJava article comparing Spring and EJB3.)

However, while Spring specifies the "wiring rules" in XML configuration files (deployment descriptors), EJB3 relies on Java 1.5's annotations to specify how objects are wired together.

So that brings us to the question, "Are annotated Java classes still POJOs?"

Personally, I think not. Why?

One definition of POJO might be: "A POJO is a Java class that is not required to extend a particular superclass or implement a particular interface."

I think a better definition of a POJO would be: "A POJO is a Java class that is not required to reference or use another resource in any way."

The second definition is more generalized. To run or compile a POJO, I should not be required to reference any third-party Java resource, whether it be a class or an interface (including annotation interfaces). For that matter, I shouldn't be required to use any core Java resource in my POJO.

(To be fair, it may be possible to run annotated classes in a JVM without requiring that the annotation interfaces exist in the classpath. Someone please correct me if I'm wrong. Either way, you certainly need the annotation interface in order to compile the class.)
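A small Java sketch may make the distinction easier to see. The @Wire annotation here is one I made up to stand in for whatever injection annotation a container supplies; Repository, AnnotatedService, and PlainService are likewise hypothetical. The point is only that the annotated class names the annotation type in its source, while the plain class names nothing beyond its own collaborator.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical injection marker, standing in for a container-supplied annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Wire {}

class Repository {}

/** Cannot be compiled unless the Wire annotation type is available. */
class AnnotatedService {
    @Wire Repository repository;
}

/** References nothing from a container; the wiring is decided elsewhere. */
class PlainService {
    private Repository repository;
    void setRepository(Repository repository) { this.repository = repository; }
    Repository getRepository() { return repository; }
}

class PojoDemo {
    public static void main(String[] args) throws Exception {
        // External wiring: this code plays the role of the XML descriptor.
        PlainService plain = new PlainService();
        plain.setRepository(new Repository());

        boolean annotated = AnnotatedService.class
                .getDeclaredField("repository").isAnnotationPresent(Wire.class);
        boolean plainAnnotated = PlainService.class
                .getDeclaredField("repository").getAnnotations().length > 0;
        System.out.println(annotated + " " + plainAnnotated); // true false
    }
}
```

Delete the Wire declaration and AnnotatedService stops compiling, while PlainService is unaffected. That compile-time dependency is the whole of my objection.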

According to this definition, Spring wires POJOs together, but EJB3's dependency injection does not, because it doesn't do dependency injection without annotations. Similarly, JSF's faces-config.xml defines POJO relationships; JBoss Seam's slick annotated "bijection" does not, because you can't use bijection without referencing the annotations. As a final example, Hibernate (with XML) persists POJOs; the annotated entity beans of the new Java Persistence API 1.0 aren't really POJOs.

Note that the Java Persistence API is able to persist your true POJOs using XML declarations, much like Hibernate. See chapter 10 of the Java Persistence API public review for more information. However, that is precisely my point: without the XML alternative, you wouldn't be able to persist any old Java object unless you have the ability to modify the original source and directly tie it to an external identifying resource (the annotation).

So, does this mean that annotations are bad? No, it does not. Annotations bring the metadata closer to the actual object, which certainly can improve development efficiency. Certainly the case can be made that annotations done right, combined with their respective containers, bring Java closer to Ruby on Rails in both fun and efficiency.

And does this mean that XML deployment descriptors are good? Not necessarily. I'd rather see these descriptors defined in something a bit more lightweight, such as YAML or JSON. But it seems to me that external descriptors certainly have their place.

In conclusion, annotated objects, for the simple reason that you can't compile them without needing external resources, are not truly "Plain Ordinary Java Objects." They have been made un-plain and un-ordinary by their annotations.

It's a simple matter of definition.

Monday, August 22, 2005

ActiveRecord and Hibernate code comparison

First, let me say that I'm not anti-Ruby or anti-Rails. Ruby looks really cool, and I hope to dig in and build a Rails web app "real soon now."

However, I feel that various Java/RoR comparisons are often quite one-sided. For an example, check out this comparison of Hibernate and ActiveRecord (RoR's O/R mapping engine) on lesscode.org.

So, what's wrong with this example, which shows how 30 lines of Java and XML Hibernate code get condensed to three lines of Ruby/Rails/ActiveRecord code?

First, this example uses all of the ActiveRecord defaults and none of the Hibernate defaults. The Hibernate XML mapping, by using defaults, could easily be simplified to something like this:

<hibernate-mapping>
<class name="models.Order">
<id name="id" unsaved-value="null">
  <generator class="native"/>
</id>
<set name="items" cascade="all">
  <key column="order_id"/>
  <one-to-many class="models.Item"/>
</set>
<property name="name" length="50"/>
</class>
</hibernate-mapping>

Second, this example doesn't use the "state of the art" Java persistence. That would be Hibernate in JDK 1.5 with EJB 3.0 annotations. With that, the code might look like this:


@Entity
public class Order {
  private Set items;
  private String name;
  private Long id;

  @Id(generate=GeneratorType.AUTO)
  @NotNull
  public Long getId() { return id;}
  public void setId(Long id) { this.id = id;}

  @OneToMany
  @JoinColumn(name="order_id")
  public Set getItems() { return items;}
  public void setItems(Set items) { this.items = items; }

  @Length(max=50)
  public String getName() { return name; }
  public void setName(String name) { this.name = name; }
}

(Note: I haven't tried to compile this, but it's pretty close.)

Third, the ActiveRecord example doesn't show the whole story. It won't work (and really is quite worthless) without a corresponding table in a database. So, the ActiveRecord example should include something like this:


CREATE TABLE ORDERS
(ID INT NOT NULL AUTO_INCREMENT,
NAME VARCHAR(50),
PRIMARY KEY (ID));

Note that the Java (with optional XML) examples above can be used by Hibernate to generate the SQL from the O/R mapping definition.

So, we're left with this...
Hibernate w/XML: 11 lines of Java code, 12 lines of XML
Hibernate w/Annotations: 11 lines of Java code, 6 lines of annotations
ActiveRecord: 3 lines of Ruby code, 4 lines of SQL

So, Ruby on Rails still wins (with less than 1/2 of the code). But not by as much. And Java's main problem with code length now boils down to those messy getters and setters.

Two Blogs?

Well, since I can't seem to create "categories" here in Blogger, and because some of the techie folk might not care to hear about my thoughts on our purpose and the meaning of life, I've created a second blog to publish my technology thoughts.

Everyone's certainly welcome to check out my other blog, named "Purposeful Experiment." Eventually it should have some fun philosophical and theological thoughts about the meaning of life. Until then, enjoy my musings on Java, web applications, and other fun tech stuff.