PGP Simplicity in Your ETL and File Workflows

Recently I’ve been working a lot on adding PGP (Pretty Good Privacy) support to Flux, and I’ve come to appreciate some of the pain points PGP can bring to a project. Check out the complexities and problems I outline with PGP, and how Flux can be used to simplify and enhance an already “pretty good” security specification. Read more here!

Someone Call Security!!!

Hopefully the title grabbed your attention, but no need to worry, I’m in no immediate danger. My personal information, along with millions of other people’s, is a different story though. You don’t have to be active in a technology community, or know much of anything about technology or programming, to see and understand the glaring security oversights that have popped up in mobile phones and devices (*cough*iPhone), credit card networks, and most recently the PlayStation Network, which was hacked and leaked personal information, “possibly” including credit card information. Sony has told its customers to check their bank statements for any questionable activity, just to be sure, so I think we all know what that means… Yes, the bad guys do have your credit card information.

Security is a tough thing to get right, and no security protocol is ever truly “unbreakable”; some are just much harder to crack than others. Now, I am certainly no security expert, nor am I a proficient hacker. I took a class in college about network security, which also covered hacking tools and techniques, and that’s about the extent of my credentials. I have, however, learned much from working in the programming industry and from some of the glaring problems companies have had with security.

When one begins to learn how to program, usually the first thing they learn is not security, nor the second, nor the fifth, nor the 20th…. My point is that security is usually not a primary subject in many curriculums. This of course depends upon the courses and focus you take, but in general, security is more of a programming specialty than a general topic. In my experience security becomes a topic when it is a necessity. Big companies such as the credit card companies, Microsoft, and Apple all without a doubt have security specialists and teams. Just like anything though, without proper testing of the security infrastructure, network, hardware, and code, a small security loophole can turn into a huge problem. We see small bugs in software all the time, and they usually cause us only a minor inconvenience. A security bug causes many more problems.

There are a few different types of security problems. Of course there’s the network security problem, where the network is left with a gaping, unsecured hole that invites hackers in for a cup of tea and crumpets. This is the equivalent of leaving your front door wide open while you take a week’s vacation, and it’s the easiest way to leak secure information. Second, you’ve got code security problems, where your program just doesn’t encrypt the proper data or doesn’t secure the proper parts of the code or the proper calls. Then you’ve also got UI security loopholes. For example, a user who is not an administrator is able, through a series of clicks, to get to a secure page without authenticating their credentials.

It is easy to point the finger at security teams or a company and say “what were you thinking?!”, but really it is a complex subject that takes a vast amount of knowledge and experience to master. There are so many ciphers, encryption algorithms, and protocols that using any combination of them instantly adds complexity. Pile on the layers of security for network, code, data, UI, etc., and it becomes a rather daunting task. That all being said, I’m not going to forgive someone for handing out my credit card information. I think what I’m getting at is that with all this complexity there needs to be a better focus on security within these mediums and platforms, instead of saying “yeah this looks pretty good to me” and then finding out there’s a huge loophole later on. I mean come on, it’s not like if someone really wanted to they could hack my cellphone and accounts and get my absolute location in real time, spending tendencies, home address…….. oh wait. I suppose that’s a whole different topic for another blog!

Flux Custom Functionality Integration: Unzip File Action

With any software, you’re never going to find something that does everything. Flux comes close, but we still have yet to write the “Laundry” and “Cooking” actions. ;) The best way to handle this in any API or framework is to make user extension seamless and easy to access. Flux does this beautifully with Custom Actions. We use the simple JavaBean Framework to integrate custom user created actions into core Flux functionality, and provide the developer with easy to program hooks into our system so that programming virtually any functionality into a Flux action becomes a snap!

Some custom functionality that has been requested as of late is support for unzipping a zip archive file using Flux. Below I will show you how easy it is to pop in this functionality. If you are interested in using this custom action and would like a copy of the source code you can download the zip file here. If you are interested in simply using this custom action in Flux and are not worried about the source, you can download the jar file here. Just place the downloaded jar file onto your engine’s classpath and you’re all set!

public interface UnzipFileAction extends Action {
  public void setDestination(String path);

  public void setZipFile(String path);
}

The interface in this case extends Action to include all the Flux Action functionality. If you were writing a Trigger then you’d need to extend the Trigger interface.

Next comes our implementation. Your implementing class is where the bulk of your action’s logic will go. The execute(FlowContext flowContext) method is the tie-in for that logic. This is where all the good stuff happens:


  public Object execute(FlowContext flowContext) throws Exception {
    UnzipFileVariable var = getVariable();
    String base = var.getDestination();
    // Treat a missing destination or "." as the current working directory.
    if (base == null || base.equals(".")) {
      base = "";
    }
    if (!base.isEmpty() && !base.endsWith("/")) {
      base += "/";
    }
    ZipFile zipFile = new ZipFile(new File(var.getZipFile()), ZipFile.OPEN_READ);
    try {
      final Enumeration<? extends ZipEntry> entries = zipFile.entries();
      byte[] buffer = new byte[51200];
      int bytesRead;
      while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        File fileEntry = new File(base + entry.getName());
        if (entry.isDirectory()) {
          fileEntry.mkdirs();
        } else {
          // Archives do not always list a directory before the files inside it.
          File parent = fileEntry.getParentFile();
          if (parent != null) {
            parent.mkdirs();
          }
          InputStream in = zipFile.getInputStream(entry);
          // FileOutputStream creates the file, or truncates it if it already exists.
          FileOutputStream out = new FileOutputStream(fileEntry);
          try {
            while ((bytesRead = in.read(buffer)) != -1) {
              out.write(buffer, 0, bytesRead);
            }
          } finally {
            // Always release the streams, even if the copy fails part way.
            out.close();
            in.close();
          }
        }
      }
    } finally {
      zipFile.close();
    }
    return null;
  }

The above code is all the logic needed to create any necessary directories, locate the zip file, and extract it to the destination. The only thing that might be considered tricky when implementing a custom action or trigger is how Flux saves the action’s state. For all actions, Flux uses a variable that is written to the database at runtime. Once you know this, however, writing the variable is extremely easy. The Unzip Action’s variable is as follows:

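import java.io.Serializable;

public class UnzipFileVariable implements Serializable {
  // Private fields; all access goes through the getters and setters below.
  private String destination;
  private String zipFile;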

  public UnzipFileVariable() {
  }

  public String getDestination() {
    return destination;
  }

  public void setDestination(String destination) {
    this.destination = destination;
  }

  public String getZipFile() {
    return zipFile;
  }

  public void setZipFile(String zipFile) {
    this.zipFile = zipFile;
  }
}

Notice the variable implements no other interface except the Java Serializable interface. This is important, as Flux needs to be able to serialize your variable into the database. Variables that cannot be serialized will cause problems at runtime. The next step is using your variable within your action/trigger implementation. It must be used within your property getters and setters so that Flux is able to generically access and save your variable to the database. The getters and setters for the Unzip Action are as follows:

  public static final String VARIABLE_NAME = "UNZIP_FILE";

  public String getDestination() {
    return getVariable().getDestination();
  }

  public void setDestination(String path) {
    UnzipFileVariable var = getVariable();
    var.setDestination(path);
    putVariable(var);
  }

  public String getZipFile() {
    return getVariable().getZipFile();
  }

  public void setZipFile(String path) {
    UnzipFileVariable var = getVariable();
    var.setZipFile(path);
    putVariable(var);
  }

  private UnzipFileVariable getVariable() {
    if (!getVariableManager().contains(VARIABLE_NAME)) {
      getVariableManager().put(VARIABLE_NAME, new UnzipFileVariable());
    } // if

    return (UnzipFileVariable) getVariableManager().get(VARIABLE_NAME);
  } // getVariable()

  private void putVariable(UnzipFileVariable variable) {
    getVariableManager().put(VARIABLE_NAME, variable);
  } // putVariable()

The getVariable() and putVariable(UnzipFileVariable variable) methods are there for convenience, so that each getter and setter can fetch the variable from the variable manager and store it back without repeating that code. The VARIABLE_NAME value can be whatever you like.
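
Because a variable that cannot be serialized only shows up as a problem at runtime, it can be worth checking serializability up front. The following is a minimal sketch using plain Java serialization. How Flux actually persists variables is not shown in this post, so treat the round trip as an approximation; SampleVariable and SerializationCheck are hypothetical stand-ins, not Flux classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in with the same shape as UnzipFileVariable.
class SampleVariable implements Serializable {
  private static final long serialVersionUID = 1L;
  private String zipFile;

  public String getZipFile() {
    return zipFile;
  }

  public void setZipFile(String zipFile) {
    this.zipFile = zipFile;
  }
}

// Hypothetical helper; not part of Flux.
final class SerializationCheck {
  private SerializationCheck() {
  }

  /** Serializes the object to bytes, reads it back, and returns the copy. */
  static Object roundTrip(Serializable original) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bytes);
    try {
      // Throws NotSerializableException if any field cannot be serialized.
      out.writeObject(original);
    } finally {
      out.close();
    }
    ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    try {
      return in.readObject();
    } finally {
      in.close();
    }
  }
}
```

Calling SerializationCheck.roundTrip(...) on a populated variable in a unit test, and comparing the getters on the copy, is usually enough to catch a non-serializable field before the action ever reaches an engine.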

Next the verify() method needs to be implemented. This is a pretty straightforward method, as it is used by Flux to ensure all properties needed for the proper execution of your action/trigger are set. The Unzip Action simply ensures the zip file path has been set. The destination is not required, as the action is designed to use the working directory of Flux as a default destination.

  public void verify() throws EngineException {
    UnzipFileVariable var = getVariable();
    // The zip file path is the only required property.
    if (var.getZipFile() == null || var.getZipFile().isEmpty()) {
      throw new EngineException("Expected zip file path to be non-null and non-empty, but it was not set.");
    }
  }

The next step is creating your BeanInfo. Your BeanInfo will instruct Flux on how to handle and use your custom action/trigger just like any other core action/trigger Flux ships with.

public class UnzipFileActionImplBeanInfo extends ActionImplBeanInfo {
  /**
   * A small image for this custom action to be displayed in the Flux GUI.
   */
  private Image smallImage;

  /**
   * A large image for this custom action to be displayed in the Flux GUI.
   */
  private Image bigImage;

  /**
   * This BeanInfo needs a BeanDescriptor.
   */
  protected BeanDescriptor bd = new BeanDescriptor(UnzipFileActionImpl.class);

  /**
   * Creates this BeanInfo with the supplied BeanDescriptor.
   */
  public UnzipFileActionImplBeanInfo(BeanDescriptor bd) {
    this.bd = bd;
  } // constructor

  /**
   * Creates this BeanInfo with a default BeanDescriptor for the Unzip File Action.
   */
  public UnzipFileActionImplBeanInfo() {
    // setup bean descriptor in constructor
    bd.setDisplayName("Unzip File Action");
    bd.setShortDescription("Unzips a specified zip archive file.");
  } // constructor

  /**
   * Returns this BeanInfo's BeanDescriptor.
   *
   * @return This BeanInfo's BeanDescriptor.
   */
  public BeanDescriptor getBeanDescriptor() {
    return bd;
  } // getBeanDescriptor()

  /**
   * You can change the icon associated with your custom action/trigger here.
   */
  public Image getIcon(int type) {
    if (type == BeanInfo.ICON_COLOR_16x16) {
      return getSmallImage();
    } // if
    if (type == BeanInfo.ICON_COLOR_32x32) {
      return getBigImage();
    } // if
    return null;
  } // getIcon()

  /**
   * Returns all JavaBean descriptors described by this BeanInfo.
   *
   * @return All JavaBean descriptors described by this BeanInfo.
   */
  public PropertyDescriptor[] getPropertyDescriptors() {
    // Start with the descriptors inherited from the base Flux action.
    List<PropertyDescriptor> descriptors =
        new ArrayList<PropertyDescriptor>(Arrays.asList(super.getPropertyDescriptors()));
    try {
      PropertyDescriptor zipFileDescriptor = new PropertyDescriptor("zipFile", UnzipFileActionImpl.class);
      zipFileDescriptor.setDisplayName("Zip File");
      zipFileDescriptor.setShortDescription("Zip file path");
      descriptors.add(zipFileDescriptor);

      PropertyDescriptor destinationDescriptor = new PropertyDescriptor("destination", UnzipFileActionImpl.class);
      destinationDescriptor.setDisplayName("Destination");
      destinationDescriptor.setShortDescription("Destination directory path");
      descriptors.add(destinationDescriptor);
    } // try
    catch (IntrospectionException e) {
      // Keep the original exception as the cause instead of only printing it.
      throw new IllegalStateException("could not create property descriptor", e);
    } // catch

    return descriptors.toArray(new PropertyDescriptor[descriptors.size()]);
  } // getPropertyDescriptors()

  /**
   * Returns a small image that represents this JavaBean.
   *
   * @return A small image that represents this JavaBean.
   */
  private Image getSmallImage() {
    if (smallImage == null) {
      smallImage = loadImage("/fluximpl/resources/images/action_icons/default_small.png");
    } // if
    return smallImage;
  } // getSmallImage()

  /**
   * Returns a large image that represents this JavaBean.
   *
   * @return A large image that represents this JavaBean.
   */
  private Image getBigImage() {
    if (bigImage == null) {
      bigImage = loadImage("/fluximpl/resources/images/action_icons/default_big.png");
    } // if
    return bigImage;
  } // getBigImage()
}

The Unzip Action’s BeanInfo extends ActionImplBeanInfo which is the base BeanInfo for Flux actions. This provides you with the other necessary property descriptors your custom action uses. Notice the getPropertyDescriptors() method. This is where you define what your custom properties are for your action/trigger.

Finally, the last bit of implementation is the factory from which you create the action.


public class UnzipFileFactory implements AdapterFactory {
  private FlowChartImpl flowChart;

  public void init(FlowChartImpl flowChart) {
    this.flowChart = flowChart;
  }

  public UnzipFileAction makeFileAction(String name) {
    return new UnzipFileActionImpl(flowChart, name);
  }
}

This class implements AdapterFactory which is what you need in order for Flux to recognize your factory as a Flux factory object. The last thing you’ll need to do is tie in your factory with the factories.properties file. The file has a single entry for this action, UnzipFileFactory=unzip.UnzipFileFactory. As long as this properties file is on your engine’s classpath, Flux will pick up and recognize your new factory!

Now to actually use the Unzip Action. The following code creates a flow chart with the custom Unzip File Action and a Console Action. Notice how the UnzipFileFactory is created and used.

public class Main {

  /**
   * Creates a simple Flux engine.
   */
  public Main() throws Exception {

    // First, make a Flux engine.
    Factory factory = Factory.makeInstance();
    Engine engine = factory.makeEngine();

    // Flux models jobs using flow charts. Create a flow chart.
    EngineHelper helper = factory.makeEngineHelper();
    FlowChart flowChart = helper.makeFlowChart("Custom Action");

    // Our flow chart will consist of a custom action. Here is how we
    // load custom actions and triggers.

    // Create the custom unzip action through its factory.
    UnzipFileFactory unzipFileFactory = (UnzipFileFactory) flowChart.makeFactory("UnzipFileFactory");
    UnzipFileAction unzipFileAction = unzipFileFactory.makeFileAction("Unzip File Action");
    unzipFileAction.setZipFile("flux-7-10-3.zip");
    unzipFileAction.setDestination("extraction");

    // Create a console action to print strings to the console.
    ConsoleAction consoleAction = flowChart.makeConsoleAction("Console Action");
    consoleAction.setMessage("Finished unzipping contents of zip file.");

    // Flow from the unzip action to the console action.
    unzipFileAction.addFlow(consoleAction);

    // Schedule the job for immediate execution.
    engine.put(flowChart);

    // Start the engine
    engine.start();

    // Wait for the flow chart to complete
    engine.join("/", "+5m", "+2s");

    // Finally, shutdown the engine.
    engine.dispose();

  } // constructor

  public static void main(String[] args) throws Exception {
    new Main();
  } // main()
}

That’s all there is to it! I hope you can use this custom action in your Flux activities, or at least have it help you in creating new custom actions of your own!
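
One hardening step worth considering: the execute() method builds each output path directly from the entry name, so an archive containing an entry such as ../../somefile could end up writing outside the destination directory. The sketch below shows one way to guard against that by comparing canonical paths. ZipEntryPaths and resolveInside are hypothetical names used for illustration; they are not part of Flux or of the downloadable action.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper; not part of Flux.
final class ZipEntryPaths {
  private ZipEntryPaths() {
  }

  /**
   * Resolves entryName against destinationDir and rejects any result that
   * falls outside destinationDir once ".." segments have been resolved.
   */
  static File resolveInside(File destinationDir, String entryName) throws IOException {
    File target = new File(destinationDir, entryName);
    String basePath = destinationDir.getCanonicalPath();
    if (!basePath.endsWith(File.separator)) {
      basePath += File.separator;
    }
    String targetPath = target.getCanonicalPath();
    if (!targetPath.startsWith(basePath)) {
      throw new IOException("Zip entry escapes destination directory: " + entryName);
    }
    return target;
  }
}
```

If you adopt something like this, pass the destination as a real directory (for example new File(".") for the working directory rather than an empty path), since an empty parent File resolves differently from an empty string prefix.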

Adapting Frameworks to Legacy Code

For the past couple of weeks I’ve been working on getting a template engine into Flux’s substitution code. The idea of a template engine fits in perfectly with the functionality of our variable substitution, and provides us with a much more powerful feature. The problems begin to creep in when the framework you are attempting to use just doesn’t fit in as nicely as you’d have hoped. I took a look at two template engines, Apache Velocity and FreeMarker. Apache Velocity was our first choice, so I began to research it and play around with it. It turned out that I could adapt it relatively well into our code. Because template engines are often used for web apps and for templating HTML files, adapting one to pure Java and setting it up to use a passed-in String was tricky. That use case was not nearly as well documented, and getting it set up was more or less a trial-and-error process.

Once it was in the code it was time to “adapt” it to our needs. With our current substitution we of course have to keep the previous syntax working for backwards compatibility, and that poses a lot of problems when attempting to plug in a framework. First of all, I can’t just pass strings in, because the template engine would have no idea what to do with our syntax. Velocity and FreeMarker give the user a decent amount of adaptability, but not enough to get around these syntax problems. I ended up having to parse the strings myself first and then pass them in to the template engine, which almost defeated the ease and purpose of using a template engine in the first place! Even though I had to parse everything beforehand and adapt our syntax into something the template engine could understand, we were still getting a lot of good functionality, like method calling in substitution. Finally, after some adaptation code, I was able to use Flux and Velocity in conjunction with each other, and it was working splendidly.

Now, since this substitution feature is a core part of what Flux does, we require that users need only the flux.jar file, and no other jar files, on their class path to use the core features of Flux. This means we would need to fold the Apache Velocity library into the flux.jar file. At first I figured, why not rename the Velocity source code, since it is open source, to something that will not conflict with a user’s other Velocity versions and then just throw that into the Flux jar? The code was renamed without incident and plugged in. However, Velocity uses default properties files that must also be updated, as they reference class files by full package names. From there it became a snowball effect of awful as these problems compounded, and soon I was attempting to rename and stuff 5 jars into the Flux jar file.

At that point I tried some tools that claim to do just these things. Jar Jar Links was one of those tools, and it did not come close to doing everything needed to get the jars into flux.jar and working properly. Next I tried going the obfuscation route. This turned out to be just as useful (that is, not very), as the obfuscator we use seemed to want to leave Velocity in the clear even after being told not to.

With these problems at hand I turned to FreeMarker, a single-library template engine that had all the functionality of Velocity. This, however, put me back to square one for implementation. Even though the two libraries are very similar, they are not one-to-one drop-in replacements for each other; some behavior in one is not the same in the other. This meant more adaptation code for me. After hours of careful thought and white-knuckle coding, FreeMarker was finally working. Although we have not officially put FreeMarker into the Flux jar, that will be easy to accomplish, as it is a single-jar library.
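
To give a flavor of the pre-parsing step described above, here is a rough sketch that rewrites a legacy placeholder syntax into the ${...} form that a template engine such as FreeMarker interpolates. The %{name} legacy syntax and the LegacySyntaxAdapter class are invented purely for illustration; Flux’s real substitution syntax and adaptation code are not shown in this post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical adapter: the %{name} legacy syntax is invented for this example.
final class LegacySyntaxAdapter {
  private static final Pattern LEGACY_PLACEHOLDER =
      Pattern.compile("%\\{([A-Za-z_][A-Za-z0-9_.]*)\\}");

  private LegacySyntaxAdapter() {
  }

  /** Rewrites every %{name} placeholder as ${name}, leaving all other text untouched. */
  static String toTemplateSyntax(String legacy) {
    Matcher matcher = LEGACY_PLACEHOLDER.matcher(legacy);
    StringBuffer result = new StringBuffer();
    while (matcher.find()) {
      String replacement = "${" + matcher.group(1) + "}";
      // quoteReplacement keeps '$' from being read as a group reference.
      matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(result);
    return result.toString();
  }
}
```

The translated string can then be handed to the template engine as usual, which keeps the old syntax working for existing users while new features come from the engine.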

This long and usually arduous process of adapting a framework into existing code is one that many, many programmers go through. In the end, picking the right framework is a very big part of making this process easy and pain free. Also, make sure you plan out every aspect of the integration. For instance, I did not think renaming Apache Velocity and throwing it into the Flux jar file was going to be that bad. It sounded easy, but I did not really think through how I would do it, and in the end that came back to haunt me. Now that it is in, though, don’t forget to check it out in Flux 7.11 so all this hard work does not go unused! ;)

The necessities of a strong file transfer solution (Part 3)

The third and final point I would like to make about solid file transfer solutions is transfer speed. This point has never been fresher in my mind than right now, as I am working on optimizing Flux file transfer speeds. Customers will argue that since their standalone file client (such as FileZilla) can transfer files at a certain speed, why can’t Flux match that speed? The argument certainly makes sense; however, it’s not that simple. Flux is a far more complex piece of software than a standalone file client. It not only has to have all the abilities of a standalone file client, but it has to perform them at the same speed while handling numerous other running flow charts and processes simultaneously. For the sake of simplicity, I am factoring out the fact that Flux can be “overloaded” and slowed down. If you put a large load on anything, you can expect some slowdown. To get to the bottom of Flux file transfer optimization and comparisons to standalone clients, I have to strip away the layers of complexity and focus on one thing: Flux as a standalone file transfer client.

First up were the speed tests. Flux uses third-party software to do all the low-level transferring for FTP, SFTP, and FTPS servers, which is where my main focus is. Local files are always fast, and no complaints have been posted about UNC transfer speeds. We ourselves have seen no cause for concern regarding UNC speeds in our tests either. I used 3 clients other than Flux: our third-party code (standalone, without Flux), muCommander, and FileZilla. Using the third-party code by itself also gave me a great way to test whether there were any bottlenecks in the Flux code itself, because if Flux was transferring much slower than the third-party code, then we would know something in Flux was holding it up. Initial tests looked good with an FTP server on my local network. Once I moved away from a local FTP server, Flux performance began to fall behind the competition. This immediately screams “buffer size”. A smaller buffer size would mean Flux transfers less on each trip across the wire than its competitors, thus slowing Flux down.

Once the potential problem area was identified, it was time to dig into the code and begin looking at how to fix it. In this case, the buffer size is incredibly easy to adjust. Since I already had my tests set up for my metrics, I began researching proper buffer sizes and how to calculate such a number. I came to the conclusion that there really isn’t a set number; it all depends upon your network. The buffer was sitting at around 8 KB. I decided to bump it up to 1 MB. This brought the file transfer to a crawl, and clearly showed that “bigger is better” was not the solution. Next I decided to go a bit smaller, at 50 KB. This proved to be the sweet spot, as Flux file transfer flew past both FileZilla and muCommander. Just to be sure, I did another test with 10 KB, which was faster than the original but not as fast as 50 KB.
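
For readers who want to try a similar experiment, here is a minimal sketch of a stream copy with a configurable buffer size. It is not Flux’s transfer code (that lives in third-party software that isn’t shown here); BufferedCopy and its copy method are hypothetical names, and against a real server the streams would come from your FTP client library rather than from memory or local files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for experimenting with buffer sizes; not Flux code.
final class BufferedCopy {
  private BufferedCopy() {
  }

  /**
   * Copies everything from in to out using a buffer of bufferSize bytes
   * and returns the total number of bytes copied.
   */
  static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    byte[] buffer = new byte[bufferSize];
    long total = 0;
    int bytesRead;
    // Each read fills at most bufferSize bytes, so the buffer size caps
    // how much data moves per read/write round.
    while ((bytesRead = in.read(buffer)) != -1) {
      out.write(buffer, 0, bytesRead);
      total += bytesRead;
    }
    out.flush();
    return total;
  }
}
```

Timing calls to copy(...) with System.nanoTime() at 8 KB, 10 KB, 50 KB, and 1 MB against the same remote file is a simple way to repeat the comparison on your own network.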

With my test results looking good and my code change in, it is off to QA for further analysis. In the end, we will likely make the file transfer buffer size in Flux configurable, allowing users to tweak the option to optimize Flux for their network. Verifying that the buffer size was the problem was 90% of the battle, and it just goes to show how such a minor detail, a single line of code, can have such a big effect.

The necessities of a strong file transfer solution (Part 2)

If you missed my previous post, you can find it here.

Continuing on with this topic, today I’ll be talking a little about security and what it means for our file transfer solution. Security is such a vital part of any file transfer solution and is a requirement of almost any user. The problem with security is that it can vary so widely. From data encryption, to file encryption, to password encryption and user authorization, a good file transfer suite has to encompass all of these.

In Flux we have support for 5 different types of hosts:

  • Local
  • FTP
  • SSH FTP (SFTP)
  • FTP over SSL (FTPS)
  • UNC/SMB

SFTP and FTPS give our users the data encryption support they require. All hosts except local provide username and password authentication support. FTPS also has settings to choose which encryption type you wish to use (TLS, SSL, or Implicit SSL), along with support for a key file. This security support has been adequate for our customers’ needs, although every now and then a customer comes along with a security requirement we do not support. In response to these requests, we have been taking measures to make Flux file transfer a more secure feature.

One of the bigger enhancements recently made to Flux file transfer security is the encryption of passwords. In earlier versions, passwords were left in the clear and could show up in logs, as well as other places in Flux. This was a big problem for some customers (and understandably so). Now passwords are always encrypted and are never shown in the clear.

Not too long ago a customer had a requirement for Flux to use an encrypted key file. They needed the key file for the FTPS transfer to be encrypted on the file system, and for that file to be passed into Flux where Flux could decrypt the key file, then use that key for the FTPS transfer authentication. Obviously they have a high need for security and are an extreme case, but it goes to show just how crazy some people get with security and to what measures they take it.

With Flux we are always looking to make our file transfer suite more secure, and make sure every user has the security they require. We continue to gather information and needs from customers, and move towards PCI compliance.

The necessities of a strong file transfer solution (Part 1)

In Flux, our file transfer suite is a big seller for a lot of clients. There are many applications and use cases that involve file transfer and when used with Flux it provides clients a way to automate entire file transfer workflow use cases easily and efficiently. This point alone has drawn many clients into our product. Since I predominantly oversee a lot of Flux’s file transfer code and functionality, over the years I have learned a lot about the wants and needs of our file transfer users. The three things that have constantly popped up, and are prevalent in other file transfer solutions, are as follows:

  • Quick processing of files
  • Security/PCI compliance
  • Fast transfer speeds

The Requirement: Quick Processing Of Files

The idea here is that once a file comes in, and the user is “monitoring” for that file, the file will be picked up and processed as quickly as possible. Flux handles this functionality using a “File Trigger”. File triggers watch for files, and once the file is there (or is not there, depending upon which trigger you are using), the trigger fires. The problem we often have with these triggers is that they are polling based. That means a user sets a specific interval at which to poll the directory and look for the file. The problem intrinsic to polling is that you have to walk a fine line between polling too often and saturating the processor, or not polling often enough and leaving files waiting while plenty of processing power sits available. Polling does not scale well, and users often set the polling delay to a low interval and then complain about high CPU usage or too much demand on the Flux engine.
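
To make the trade-off concrete, here is a bare-bones sketch of what a polling check boils down to. It is not Flux’s File Trigger implementation; FilePoller and its parameters are hypothetical. Every pass through the loop costs a file system check, so a shorter interval means faster pickup but more work done while nothing has arrived.

```java
import java.io.File;

// Hypothetical illustration of polling; not Flux's File Trigger.
final class FilePoller {
  private FilePoller() {
  }

  /**
   * Checks for the file every intervalMillis until it exists or timeoutMillis
   * has elapsed. Returns the number of checks performed when the file was
   * found, or -1 if the timeout was reached first.
   */
  static int waitForFile(File file, long intervalMillis, long timeoutMillis)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    int checks = 0;
    while (true) {
      checks++;
      if (file.exists()) {
        return checks;
      }
      // Give up if the next check would land after the deadline.
      if (System.currentTimeMillis() + intervalMillis > deadline) {
        return -1;
      }
      Thread.sleep(intervalMillis);
    }
  }
}
```

Multiply the number of checks by the number of triggers an engine is running and it becomes clear why very short polling intervals show up as CPU load.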

The Solution

What we want to do is move away from a polling system and toward an event registration system. This way we are not using any unnecessary processing power by polling frequently, but can still achieve a quick turnaround when a file arrives. Setting up an event-based registration system for file processing is a bit tricky, especially since Flux can run on many different operating systems. We have brainstormed the idea of using File Agents to register with the OS and be notified of a file coming in, being edited, or being deleted. This would give us real-time processing abilities with files. The best part is that you could install these file agents on any machine, and they could report back to the Flux engine once an event has occurred. This would take out polling entirely and give us quicker file processing. The problem is that you would have to implement different registration techniques for different operating systems. This begins to complicate the solution for the client, as they would need to run different file agents on different machines. This might be taken care of by automatically detecting the operating system and creating the appropriate agent for the machine. This, of course, could be overridden by the user with a system variable or configuration option to specify exactly which OS the agent is running on.

An exciting new feature in Java 7 called WatchService should solve these problems for us. It will be able to register with the low-level OS code and give us notification-driven processing. Java 7 is due to be released on July 28, 2011. Whether or not we can wait that long to replace our polling system is yet to be determined…
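
Here is a rough sketch of what notification-driven processing looks like with the java.nio.file.WatchService API. DirectoryWatcher and awaitCreate are hypothetical names, and this is only an illustration of the API, not how Flux implements (or will implement) its file triggers.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper built on the Java 7 WatchService API; not Flux code.
final class DirectoryWatcher {
  private DirectoryWatcher() {
  }

  /**
   * Blocks until a file is created in dir or the timeout elapses.
   * Returns the new file's name relative to dir, or null on timeout.
   */
  static Path awaitCreate(Path dir, long timeout, TimeUnit unit)
      throws IOException, InterruptedException {
    try (WatchService watcher = dir.getFileSystem().newWatchService()) {
      // Ask the file system to notify us about newly created entries.
      dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
      long deadline = System.nanoTime() + unit.toNanos(timeout);
      while (true) {
        long remaining = deadline - System.nanoTime();
        if (remaining <= 0) {
          return null;
        }
        // Wait for a notification instead of checking the directory in a loop.
        WatchKey key = watcher.poll(remaining, TimeUnit.NANOSECONDS);
        if (key == null) {
          return null;
        }
        for (WatchEvent<?> event : key.pollEvents()) {
          if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
            return (Path) event.context();
          }
        }
        // Re-arm the key so further events can be delivered.
        key.reset();
      }
    }
  }
}
```

Note that only files created after the directory has been registered are reported, and on platforms without native notification support the JDK may fall back to its own internal polling, so pickup is not always instantaneous.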

I will continue on our plans to improve the Flux file transfer suite in subsequent posts. Stay tuned ;)

To be or not to be, that is the question (at least for Java)

In somewhat recent news, a big blow to Java developers and enthusiasts around the world came in the form of an apple. Apple announced that they were deprecating Java from their OS. This brought a lot of concern as to the future of Java and its role in the development community. I myself deal primarily with Java and write it on a daily basis. I was taught Object Oriented design primarily in Java as well, although I do of course understand the concepts of programming in OO design and have worked in other languages such as C++ and C#. Being so deeply immersed in the Java community, hearing the news that a company as large and popular as Apple is breaking up with Java gave me a brief feeling of dread. What will become of this language which I have grown to love and enjoy so thoroughly?

I read an interesting article on javaworld.com in which the author, Savio Rodrigues, argues that Java is indeed safe. He outlines the problems the Java community has run into but states simply that Java is “too big to fail”. I agree with his statement, in that Java has a huge following not only in development communities but in commercial usage and applications as well. Now of course, like anything, Java will not be around forever, and eventually it will run its course. Whether or not Apple has pushed Java one step closer to eradication is tough to say this early on. Who knows what Java developers and users on Apple platforms will do with the deprecation. Will open source JDKs become much more powerful due to the ending of the Apple JDK? Will Java be phased out of the Apple world for good? Tough questions to answer indeed, with a LOT of different opinions and facts. I, for one, would hate to see Java go, and you can bet I’ll be right there alongside the Java advocates within that very large community :)

Dancing with the Database: Normalization on ACID

Throughout this last month I’ve been doing a lot of work on honing the performance of a few features in Flux. One dealt with our Run History feature and the other with Derby deadlocks. Looking deeper into these problems I came to the conclusion that our database design had a few flaws when it came to these features (nobody is perfect right?!). This required me to strap on my database engineer hat and begin racking my brain for the information I absorbed in a few of my database design classes back in college.

Normalization and ACID are always the first things to come to my mind when dealing with database systems. As you may or may not know, normalization reduces redundancy in your tables, which helps keep your application from putting them into an inconsistent state. ACID (Atomicity, Consistency, Isolation, Durability) is more of a design rule for database systems than something directly applicable to the client application accessing the database; however, it is still relevant. The database lecture aside, I began to analyze the problems from the database’s point of view.

The Run History feature came first. It is designed to return accurate metrics of flow chart runs to the user. The problem was that once you began to build up a large number of runs, which is easy in a high-throughput Flux environment, the performance of the feature fell to pieces. Looking deeper into the problem, it was evident that the performance bottleneck originated in the database. Within one table there were two different types of data being represented. In theory, having all the data in one table should help performance, as you don’t have to join tables or issue multiple queries to retrieve data you can get with one query on a single table. In practice, once the table begins to grow, database locking and the database system’s analysis of the returned data become sluggish. By breaking the table apart I was able to reduce the size of the tables, keep data integrity, and reduce the amount of time it took to retrieve our Run History data. Thus performance was improved.

Next was the deadlock. This Derby database deadlock occurred due to a join on two of our heavily accessed tables. Although not ideal in a database system (theory is only good….well, in theory), the solution I used was to copy the needed information from the joined table into the joining table, so that all the information could be accessed without the join. This freed up locks in both tables, relieving the deadlock scenario.

So ends my database adventure….for now.
