Monday, May 5, 2008

Accessing the Salesforce.com API with Ruby

I'm working on moving some data around between our system and Salesforce. The How To's I saw were a bit thin, so I'm documenting a bit of it here.

First, you need a Salesforce.com account. But just having the account isn't enough. You've got to enable the API for your organization. And even though their docs say it can be done for a "Professional Level" account, tech support says it can't.

From this page, browse to the Introducing the Force.com API doc (under Getting Started) and read the second paragraph to see if your account qualifies.

This feature is enabled by default for Unlimited, Enterprise, and Developer Editions. Some Professional Edition organizations may also have the API enabled.

However, if your company doesn't have the "right" type of account, you can still tinker a bit using a free developer account. Visit the developer site to sign up.

Then you've got to fetch a security token. Here are the directions, right from an error message.

LOGIN_MUST_USE_SECURITY_TOKEN: Invalid username, password, security token; or user locked out. Are you at a new location? When accessing Salesforce--either via a desktop client or the API--from outside of your company’s trusted networks, you must add a security token to your password to log in. To receive a new security token, log in to Salesforce at http://www.salesforce.com and click Setup | My Personal Information | Reset Security Token.

To use your security token, append it to your password. So your submitted password would be passwordTOKEN.

Now that your account is ready, you can start adding some code. I'm using the Rails Active Record adapter, ActiveSalesForce. Once installed, the adapter comes with essentially no documentation. Directions are on the page above, but in a nutshell, do the following:

gem install activerecord-activesalesforce-adapter

Then put the following into your database.yml file:

adapter: activesalesforce
username: salesforce-username
password: salesforce-password

Optionally, you can (allegedly) hit the test server instead by adding:

url: https://test.salesforce.com

But I have yet to log in to the test site successfully. Their accounts don't seem to propagate.

Finally, you're ready to use the adapter. I like to use ./script/console for basic testing.

You can do things like this:

class Account < ActiveRecord::Base
end

accts = Account.find(:all)
accts.first.name

I still haven't found any docs with the list of objects available via the API and rake db:schema:dump doesn't work, but if you know the existing API, this should get you up and running with the Rails code.

If you want to just access the data from Ruby, try the following:

require 'rubygems'
require 'activesalesforce'

class Account < ActiveRecord::Base
end

ActiveRecord::Base.establish_connection(
  :adapter  => 'activesalesforce',
  :username => 'your_username@some_domain.com',
  :password => 'passwordTOKEN',
  :url      => 'https://www.salesforce.com'
)

accts = Account.find(:all)
accts.first.name

Tuesday, April 15, 2008

Automated Testing -- Cooking up your own language

In an earlier post, I exposed the fallacy that building automated tests required a huge investment -- unless you think 10% of your time is huge. One of the ways we reduced the time was an investment in test infrastructure. This investment significantly reduced the time and pain in authoring individual tests. A key aspect of this infrastructure was a Domain-specific language (DSL) implemented in Groovy.

Writing a language seems daunting, and it certainly seems difficult to justify. But, I already shared the numbers so we now know that it's not daunting at all. So how did we do it?

Start with dessert

One of my favorite geek terms is syntactic sugar. We chose Groovy for our DSL, since it possesses a good amount of sugar and is Java. By "is" I mean that it compiles directly to Java bytecode, can invoke any Java class, and can be invoked by any Java class. There are several language characteristics of Groovy that make it ideal for developing a DSL:

  • Dynamic typing
  • Parentheses are not required for method invocations.
  • Closures
  • Easy dereference of class-level members ( no need for getters or setters ).
  • Ability to handle dynamic properties and methods through methodMissing and propertyMissing
  • Categories and metaclasses

Now plan the menu

After choosing a basic framework and brushing up on what's possible with Groovy, I began by writing an ideal test. I essentially wrote pseudo-code for the functions I wanted to execute in the least verbose, easiest-to-read way. I began with simulating sensors transmitting data, which is a key function for us. Here's the code for it:

Send_Data( key ) {
  Activity "01/22/2008 9:47AM".timestamp,  "thunderbird.exe",  "Window",  "Inbox for attr@6sa.com"
  Activity "01/23/2008 9:50AM".timestamp, "Eclipse", "Open File", "/home/todd/src/Foo.groovy"
}   

That's it. In just a few very readable lines of code, it's easy to ship data to the server. The code is minimalist -- meaning that each line ( and almost each character ) is directly tied to the task. How did this compare to the status quo? Here's a snippet of the same functionality in Java.

List<String> data = new ArrayList<String>();

DateFormat timeFormat = new SimpleDateFormat( "M/d/y h:mma" );
String tstamp = String.valueOf( timeFormat.parse( "01/22/2008 9:47AM" ).getTime() );
data.add( "Activity" );
data.add( tstamp );
data.add( "thunderbird.exe" );
data.add( "Window" );
data.add( "Inbox for attr@6sa.com" );

List<String> dataList = new ArrayList<String>();
dataList.add( StringListCodec.encode( data ));

// Repeat for the second line of data....

Map<String, String> params = new HashMap<String, String>();
params.put("key", key);
params.put("data", StringListCodec.encode(dataList) );

return new DataSender( params ).run();

As you can see, the DSL is significantly simpler than the required Java code. In Java pre-DSL, sending data requires about four times as many lines of code. Many of the characters and lines are just cruft required by the Java programming language -- not required by our test.

Putting it all together

The first step in putting this together is making use of Groovy's closure capability, specifically passing a closure as the parameter to a method.

def Send_Data( key, closure ) {
  def data = new SensorData().processData( closure );
  // Call internal utility to send data.
  new DataSender( [ key: key, data: data ] ).run()
}

In Groovy, when the last parameter of a method is a closure, the caller can pass it as a block in curly braces after the argument list ( note: the parameter name doesn't have to be closure ). In this example, the contents of the curly braces are passed as an executable block to the method. The method then sends the closure on to a helper class for processing.

The next class is the code to bundle the data for processing. It essentially takes the parameters and builds up an encoded String for transmission. There is a method per data type, so sending alternate types of data, like Commit, simply requires a line within the Send_Data block with "Commit" and the method parameters.

class SensorData {

  // Encoded rows collected while the closure runs
  // ( declaration assumed; it was not shown in the original listing ).
  def dataset = []

  /* Method to encode Activity data */
  def Activity( tstamp, tool, type, data ) {
    dataset << [ "Activity", tstamp, tool, type, data ].encode
  }

  // .. There are methods for each type of data ..

  def Commit( tstamp, tool, commitTime, filename, authorname, repositoryname, branchno, versionnum, totallines, linesadded, linesdeleted, log ) {
    dataset << [ "Commit", tstamp, tool, commitTime, filename, authorname, repositoryname, branchno, versionnum, totallines, linesadded, linesdeleted, log ].encode
  }

  /* Main method for processing data. */
  def processData( closure ) {

    // Tell the closure to run on the current instance.
    closure.delegate = this

    // Add the ability to write:   [ "this", "is", "a", "string", "list" ].encode
    ArrayList.metaClass.getEncode = { ->
      StringListCodec.encode( delegate )
    }

    // Run the closure, which in turn invokes the methods for each data type.
    closure()

    return dataset.encode
  }
}
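For readers who think in Java, the delegate trick can be approximated by handing the collector to a lambda explicitly. This is only a sketch for comparison, not our implementation: SensorBatch, activity, and collect are hypothetical names, and the rows are kept as plain lists rather than being encoded with StringListCodec.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical collector standing in for SensorData; the names are illustrative only.
class SensorBatch {
    private final List<List<String>> rows = new ArrayList<>();

    // One method per data type, mirroring Activity(...) in the Groovy class.
    void activity(long tstamp, String tool, String type, String data) {
        rows.add(Arrays.asList("Activity", String.valueOf(tstamp), tool, type, data));
    }

    // Runs the caller's block against a fresh batch and returns what it recorded.
    // Groovy's closure.delegate makes the receiver implicit; Java has to pass it in.
    static List<List<String>> collect(Consumer<SensorBatch> block) {
        SensorBatch batch = new SensorBatch();
        block.accept(batch);
        return batch.rows;
    }
}
```

A call then looks like SensorBatch.collect(b -> b.activity(...)). Every line has to name b, which is exactly the noise the Groovy delegate removes.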

The last piece is the date/time helper. While Java has many powerful date and calendar classes, they are extremely verbose. I used a Groovy Category to make entering dates easy. A Category is a class whose static methods augment existing classes within a use block; the first parameter of each method is the object being extended. In my example, I augment the String class to support a .timestamp property, which takes the string and returns a timestamp. This is quite easy to do:

public class SensorTimeHelper {
 
 public static final String timeFormatStr = "M/d/y h:mma";
 
 public static Date getTime (  String str ) throws Exception {
  return new SimpleDateFormat( timeFormatStr ).parse(str);

 }
 
 public static Long getTimestamp (  String str ) throws Exception{
  return new SimpleDateFormat( timeFormatStr ).parse(str).getTime();
 }
}
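To see what the helper computes, here is a small standalone Java sketch of the same parse. It is an illustration rather than our code: the class name TimestampDemo is made up, it pins the locale to US and the time zone to UTC (neither of which the helper above does) so the result is repeatable, and it wraps the checked ParseException for convenience.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; only the pattern string comes from SensorTimeHelper.
class TimestampDemo {
    static final String TIME_FORMAT = "M/d/y h:mma";

    // Parses text such as "01/22/2008 9:47AM" into milliseconds since the epoch.
    static long toTimestamp(String text) {
        // SimpleDateFormat is not thread-safe, so build a new one per call.
        SimpleDateFormat format = new SimpleDateFormat(TIME_FORMAT, Locale.US);
        // Assumption for this sketch: interpret the input as UTC.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not in " + TIME_FORMAT + " format: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toTimestamp("01/22/2008 9:47AM"));
    }
}
```

Without the fixed time zone the same string maps to a different number on every machine, which is worth keeping in mind when test data is meant to be compared across servers.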

Just a taste

The reality is that you can make your language as complex as you want. The example I provided is just a taste of what's possible and how you can use it. Ultimately, the goal is to make authoring tests ( using the language ) as easy and readable as possible. I'd encourage organizations to consider implementing their own DSLs -- it's certainly not as huge an endeavor as it would seem, and it yields great results.

Thursday, March 27, 2008

Doctypes hanging IE

I wanted to write up an issue we've finally identified with one of our reports (our audit report). We were seeing extended hangs, from 7 seconds up to 5 minutes, with the CPU pegged at 100% the entire time. I know IE has issues, but this was beyond tolerating. The content of the page was a report in HTML rendered using a tool called BIRT (http://www.eclipse.org/birt/phoenix/). We bring the reports down in an XHR call made immediately after the initial page load. The report in this case was a table of 1000+ rows; large, but nothing too excessive. Many other pages also contained BIRT reports, and none of those exhibited this defect. So what was up with this one? (NOTE: No other browsers tested had issues. [tested: Firefox 2.x, Opera 9.x, Safari 3.x])

The answer was subtle and troublesome. To figure out what was causing the slowness, I first tried to create a minimal test case. The only problem was I couldn't. None of my tests were failing. In fact, I built my test up to the point where it resembled the problem page almost exactly and it still wasn't failing. There was only one thing left to add to the test case, and that was the Doctype tag that the real page had at the top. Eureka. The test immediately failed like I wanted. So what does that mean? I knew that doctype tags told the HTML validator which rules to test the HTML against, but surely it wouldn't really affect a browser's capabilities like this? Well, it doesn't affect most browsers much, but it makes a huge difference to IE.

   our Doctype tag:
   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

After reading up on Doctypes at several different sites (I recommend http://www.quirksmode.org/css/quirksmode.html), I had learned a lot of tricks to try, including different doctypes with various levels of enforcement, but none of these made any difference. The only thing that helped IE was putting the page into "quirks mode" (removing the Doctype tag), which is an inadvisable place to be, especially in IE. (Read up in the quirksmode.org page listed above to see a very helpful table of HTML/CSS issues that different browsers have in "quirks mode" vs "standards mode".)

Since removing the Doctype fixed the problem, I thought that perhaps the report had excessively invalid HTML. But after testing the report in an HTML validator (http://validator.w3.org/check), it turns out the HTML was only slightly invalid and *shouldn't* have been causing the problem. So it was either the volume of the code or simply the way in which it was invalid that was beyond IE's delicate rendering capabilities in "standards mode". The only solution we've found so far without modifying BIRT's HTML renderer was to special-case that particular page in IE so that it doesn't include a Doctype. (Keep in mind that in "quirks mode" IE behaves quite differently, and therefore extensive additions to our page's CSS were needed to accommodate it.)

The reason this is troublesome is that it really makes you wonder what else could be affected by this particular IE shortcoming. Using the Dojo toolkit (http://dojotoolkit.org), for example, creates the opportunity for invalid HTML when its widgets are used as declarative HTML. If you were to use enough widgets on page load, you might recreate this same result. Also, any sites that use HTML-generating code (like we do with BIRT) need to be very cognizant of the quality and volume of the generated HTML, or else risk the results we've seen.

I'm sure IE 8 will solve all these issues...right.

Tuesday, March 18, 2008

Automated Testing -- It's the thought that counts

After encountering a painfully embarrassing bug in a production release of our product, I decided to make it a personal goal to increase our automated testing. Like many start-ups, our focus was on building features and value into our product. While we certainly value testing, and automated tests specifically, we always found an excuse to under-invest in this area. The bottom line is that automated testing is hard and requires a tremendous investment....

But wait, is it really a "tremendous" investment or is it simply the thought of it that's "tremendous?" There are always a million excuses why automated tests should be delayed, including:
  • There really isn't a framework that applies to our unique app.
  • It will take more time to write the test than the code itself.
  • Isn't that what QA is for?
  • I can either write a test or deliver it on time.
Honestly the excuse doesn't matter. There always seems to be one. You can solve all the excuses and magically new ones appear.

My solution was to personally attack the problem. Lead by example. Absorb the pain and measure the results.

The Results

In one 10-day Sprint, I managed to develop a new testing framework highlighted by our own Domain Specific Language (DSL) and write automated tests covering basic product functions like creating teams, users, etc. Over the next few blog posts I'll elaborate on the results, including the gory technical details.

In this post, I want to focus on the "pain" and "tremendous" sacrifice that I must have endured to get there. So I did cheat and reuse an existing library we had with some infrastructure for sending data, running queries, and running reports. But given that, here are the stats for the effort:
Component                 Time in Active Hours
Existing Library          137
New DSL and framework     23
TOTAL                     160
TEAM WEEKLY AVG           220
Our total investment in this area was less than one week of total active time for our team, and that figure covers all the time ever spent on these artifacts. Hm. Not a "tremendous" investment. Now this does not include any tests that were written using the framework. Here are the figures for the tests:
Number of Tests                  18
Average Time to Write a Test     < 1 hr
Total Time on Tests              ~ 20 hrs

1 measly hour per test. That's it. That also doesn't seem so painful.

Finally I looked at all of our time spent on specific modules of our product over the last 75 days ( beginning in January ), and our total investment in all testing modules was only 10% of the overall time.

Tremendous is 10%

So according to real quantitative data, the tremendous pain of automated testing is 10%. It's less than an hour per day. The next time a developer ( or manager ) is reluctant to invest in automated testing due to the pain, inform them that only the thought is painful -- the reality is only 10%. Of course, I'd encourage you to measure it like we did to prove it, and if the investment grows above 10% you can always reprioritize.

Monday, March 17, 2008

F is for Functional

Admittedly when I first heard that Microsoft developed a new language "F#", "functional" wasn't the first "f"-word that came to my mind.

During a recent customer visit, a development lead was aggressively espousing the virtues of this new language that I honestly knew nothing about. For those of you like me, F# is a functional language coming out of Microsoft Research that compiles to .NET. Meaning that you can call F# from C#/VB.NET and vice-versa. His passion for F# and functional languages interested me. He made comments about the increased productivity. And even made comments like "there should be no bugs." I love comments like that. He said he would never return to imperative languages.

For those that haven't programmed in a functional language, it's quite different from imperative languages. While the industry is debating Ruby vs. C# vs. Java, they are all fairly similar in their approach ( which I suppose is a good thing ). Functional languages, however, represent a different way of thinking. I spent some time with Lisp in college, so I am a bit familiar with the style and had the opportunity to use it a bit here. It turns out that internally we had implemented a piece of our product ( albeit small ) with Elisp -- Emacs' Lisp dialect.

Most sensors are implemented in their native technology, and Emacs is no different. While we inherited most of the sensor from the good folks at the Hackystat project at UH, we did have to modify it to support our revised Desktop architecture. We've implemented some base utilities for writing raw data to files in numerous languages: Elisp, Java, C#, C++, and even Vim script. Because the same utility exists in each language, it provides a good representative sample for comparing them.

Language     Time           LOC     # of words
Java         40 minutes     42      160
Elisp        10 minutes     18      94

So Lisp took less time, fewer lines and even fewer words. Java's style typically has more lines for readability, so words is likely a better measure. This Lisp implementation includes a custom function for URL-encoding the data, while Java has one built-in. Java's strength is certainly in its breadth and community support. There's a library in Java to solve just about every problem. However, in this example, the Lisp version still required fewer lines of code and less time, even though it had to include a custom library to solve the exact same problem. Imagine what it'd be like if functional languages gained more popularity in open-source communities.

Here's a peek at the Elisp code for those who are interested:

(defun convert*command (cmdstr data)
  "Converts the # delimited command to a URL-encoded string"
  (mapconcat
   (lambda (item)
     (concat (car item) "=" (urlencode (cadr item))))
   (append (list '("user" "") '("tool" "Emacs") (list "type" cmdstr)
                 (list "timestamp" (hackystat*util*milliseconds-string (current-time))))
           (let ((cnter 0) (value))
             (reverse (dolist (datum data value)
                        (setq value (cons (list (concat "data." (number-to-string cnter)) datum) value))
                        (setq cnter (1+ cnter))))))
   "&"))

(defun urlencode (str)
  "Encodes the provided str for use in a URL"
  (mapconcat
   (lambda (c)
     (cond ((string-match "[a-zA-Z0-9]" (char-to-string c)) (char-to-string c))
           ((= c 32) "+")
           (t (format "%%%02X" c))))
   str ""))
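For comparison, the built-in Java encoder mentioned earlier is java.net.URLEncoder. The sketch below is not our sensor code: QueryStringDemo and its method names are made up for illustration. It joins key=value pairs with & the way convert*command does, encoding only the values.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical demo class, not part of the sensor code.
class QueryStringDemo {
    // URL-encodes one value: spaces become "+", reserved characters become %XX.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to exist on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Joins the entries as key=value pairs separated by "&", in the map's iteration order.
    static String toQuery(Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (out.length() > 0) {
                out.append('&');
            }
            out.append(field.getKey()).append('=').append(encode(field.getValue()));
        }
        return out.toString();
    }
}
```

Passing a LinkedHashMap keeps the fields in the order they were added. One difference from the Elisp version: URLEncoder leaves ".", "-", "*" and "_" untouched, while the Elisp function escapes everything that is not a letter, digit, or space.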

So why not move everything to functional languages?

The biggest challenge with moving to functional languages is the altered way of thinking. Did you immediately understand the code sample above?

If you can hire ( or have ) developers that are well-versed and comfortable with functional programming and thinking, then it may be a good alternative. Having worked with it myself and shifted between Java and Elisp, I'll say that it is a challenging transition. I found myself wanting to write Elisp in an imperative way. It took a day or so to fully migrate my brain to the different way of thinking, which is heavily recursive.

Functional languages are a very interesting alternative for development teams possessing the appropriate skills. There's no question that problems can be solved faster and with far fewer lines of code, and while each line may take a little more time to understand experienced developers should grasp the concepts with a little time. I would recommend committing to one style for a period of time. Shifting between the paradigms is challenging and simply a waste of time.

Whether it's F# for .NET shops or Erlang for concurrent and high-availability applications, there are a number of powerful functional options available for your immediate consumption. Regardless of your choice, I'd love to get more data on people's experience with these languages. Please contact us if you're interested in working with us to collect some data.

Wednesday, February 13, 2008

G-Rails

"Ohh...Rails very cool." "No, I said G-Rails actually." [ blank stare ] "It's based on Groovy." [ blank stare ] "Groovy is a new-ish scripting language for Java. Grails is an MVC framework based on Hibernate and Spring and is great for organizations that have an existing investment in Java/J2EE and want some of the productivity enhancements of something like Rails."

And so went a recent conversation I had at a conference. While Rails has clearly made it into the mainstream techie vernacular, Grails (and Groovy) has not. Despite leveraging proven packages like Hibernate and Spring, Grails is far from mainstream, mainly due to a lack of well-known implementations. I hope this changes, so I wanted to do my part and evangelize our usage.

Our decision

When we began development on our product, we decided to use Java, given the experience of the team and a number of 3rd party Java packages that we were planning on using. I evaluated a number of the Java MVC frameworks including Struts, JSF, and Spring. Rails was a relatively new phenomenon, but I really liked its focus on Convention over Configuration. I found a package, Sails, that provided a simple Java-based framework and was coincidentally built by a group of local individuals whom I happened to know and respect.

The basic architecture of our product was an EJB-based backend using BEA's Weblogic CMP implementation with a business tier in EJB session beans. Sails provided a simple MVC layer on top of this architecture. Over time, we decided to refactor the architecture to use EJB3 with JPA entities ( using Hibernate ).

While we were reasonably happy with Sails, it is not actively developed. In fact, we were starting to lose the efficiencies we desired when we first chose it. This sparked an interest in choosing something else. Grails has continued to gain traction over the past few years and, most importantly, is actively developed. Now that our entities were 100% Hibernate, Grails became a very interesting option, so we decided to take the plunge for new web development. Plus Groovy was an intriguing option to gain some productivity in our web tier.

Enterprise Grails

Unlike the sample applications you'll see in books or documentation, our Grails app is just a piece of our overall product. Because we have a large and complicated entity layer, we stuck with JPA versus using GORM. More importantly, Grails isn't the only consumer of our entities. We have a business tier with Session Beans ( with Web Service interfaces ) and Message-Driven Beans. Our entities have many relationships of all types and have fairly complex inheritance schemes -- again unlike many of the examples you'll see.

I wish I could say that using Grails with this architecture has been flawless. It has not. There's no question that the primary usage of the framework is via GORM with a less sophisticated entity design. However, we've managed to have success. When a bug is encountered, the newsgroup is very active. Plus we can and have changed the source to resolve issues quickly. We routinely and frustratingly fight LazyInitializationExceptions. We've worked around this using Session Bean-based DAOs that fetch all the necessary associations. We could not deploy two Grails WARs in the same EAR. We've worked around this by consolidating the two into one.

Despite these issues, we are very pleased with Grails. First, we like Groovy. Although it takes getting used to, Groovy is a far more productive language than Java yet doesn't sacrifice anything. I'll be posting more about Groovy, since I think it has a number of applications that can benefit Java-based organizations. The Grails scaffolding ( when we use it ) saves us time. We have a number of administrative screens that end users never see that are almost completely generated. This is a huge help.

Wishlist

Our biggest wish for Grails is better support for JPA which appears to be scheduled "post 1.0". I doubt we are the only organization that has an existing J2EE application with JPA entities that wants to use Grails. I would bet that many of the enterprise-class J2EE applications are in similar situations. To me this should be the Grails team's focus, since teams building new applications can just as easily choose Rails. Grails is far more compelling in projects with a legacy in Java.

Summary

I'd encourage teams with existing investments in Java and J2EE to evaluate Grails. It's a compelling choice for teams interested in leveraging advances in web infrastructure without throwing away the parts of their application that work.

Thursday, January 10, 2008

Deleting Entities, Part 2

Just wanted to post a follow-up to my previous post on Deleting Entities now that it is resolved. Despite trying to clear out the objects from Collections that were fetched, I was unable to resolve the issue in this manner. Thankfully, I found the answer with a little more Googling. We had gotten in the habit of labeling many associations as CascadeType.ALL, which includes REFRESH, PERSIST, REMOVE, and MERGE. ALL seemed like the right choice since it was all-inclusive. However, according to some of my reading, PERSIST attempts to re-save an object's state during a commit (note: the entire object graph). The exception I was seeing occurred when the application was attempting to save something that was supposed to be deleted. I modified the key associations ( the ones indicated in the exception ) to be CascadeType.REMOVE. That was it! I suppose the lesson here is to only add behavior you are confident that you need, because there may be unintended consequences.