Java Resources

The Latest Java Topics

Last time we familiarized ourselves with ListenableFuture. I promised to introduced more advanced techniques, namely transformations and chaining. Let's start from something straightforward. Say we have our ListenableFuture which we got from some asynchronous service. We also have a simple method: Document parse(String xml) {//... We don't need String, we need Document. One way would be to simply resolve Future (wait for it) and do the processing on String. But much more elegant solution is to apply transformation once the results are available and treat our method as if was always returning ListenableFuture. This is pretty straightforward: final ListenableFuture future = //... final ListenableFuture documentFuture = Futures.transform(future, new Function() { @Override public Document apply(String contents) { return parse(contents); } }); or more readable: final Function parseFun = new Function() { @Override public Document apply(String contents) { return parse(contents); } }; final ListenableFuture future = //... final ListenableFuture documentFuture = Futures.transform(future, parseFun); Java syntax is a bit limiting, but please focus on what we just did. Futures.transform() doesn't wait for underlying ListenableFuture to apply parse() transformation. Instead, under the hood, it registers a callback, wishing to be notified whenever given future finishes. This transformation is applied dynamically and transparently for us at right moment. We still have Future, but this time wrapping Document. So let's go one step further. We also have an asynchronous, possibly long-running method that calculates relevance (whatever that is in this context) of a given Document: ListenableFuture calculateRelevance(Document pageContents) {//... Can we somehow chain it with ListenableFuture we already have? First attempt: final Function> relevanceFun = new Function>() { @Override public ListenableFuture apply(Document input) { return calculateRelevance(input); } }; final ListenableFuture future = //... final ListenableFuture documentFuture = Futures.transform(future, parseFun); final ListenableFuture> relevanceFuture = Futures.transform(documentFuture, relevanceFun); Ouch! Future of future of Double, that doesn't look good. Once we resolve outer future we need to wait for inner one as well. Definitely not elegant. Can we do better? final AsyncFunction relevanceAsyncFun = new AsyncFunction() { @Override public ListenableFuture apply(Document pageContents) throws Exception { return calculateRelevance(pageContents); } }; final ListenableFuture future = //comes from ListeningExecutorService final ListenableFuture documentFuture = Futures.transform(future, parseFun); final ListenableFuture relevanceFuture = Futures.transform(documentFuture, relevanceAsyncFun); Please look very carefully at all types and results. Notice the difference between Function and AsyncFunction. Initially we got an asynchronous method returning future of String. Later on we transformed it to seamlessly turn String into XML Document. This transformation happens asynchronously, when inner future completes. Having future of Document we would like to call a method that requires Document and returns future of Double. If we call relevanceFuture.get(), our Future object will first wait for inner task to complete and having its result (String -> Document) will wait for outer task and return Double. We can also register callbacks on relevanceFuture which will fire when outer task (calculateRelevance()) finishes. If you are still here, the are even more crazy transformations. Remember that all this happens in a loop. For each web site we got ListenableFuture which we asynchronously transformed to ListenableFuture. So in the end we work with a List>. This also means that in order to extract all the results we either have to register listener for each and every ListenableFuture or wait for each of them. Which doesn't progress us at all. But what if we could easily transform from List> to ListenableFuture>? Read carefully - from list of futures to future of list. In other words, rather than having a bunch of small futures we have one future that will complete when all child futures complete - and the results are mapped one-to-one to target list. Guess what, Guava can do this! final List> relevanceFutures = //...; final ListenableFuture> futureOfRelevance = Futures.allAsList(relevanceFutures); Of course there is no waiting here as well. Wrapper ListenableFuture> will be notified every time one of its child futures complete. The moment the last child ListenableFuture completes, outer future completes as well. Everything is event-driven and completely hidden from you. Do you think that's it? Say we would like to compute the biggest relevance in the whole set. As you probably know by now, we won't wait for a List. Instead we will register transformation from List to Double! final ListenableFuture maxRelevanceFuture = Futures.transform(futureOfRelevance, new Function, Double>() { @Override public Double apply(List relevanceList) { return Collections.max(relevanceList); } }); Finally, we can listen for completion event of maxRelevanceFuture and e.g. send results (asynchronously!) using JMS. Here is a complete code if you lost track: private Document parse(String xml) { return //... } private final Function parseFun = new Function() { @Override public Document apply(String contents) { return parse(contents); } }; private ListenableFuture calculateRelevance(Document pageContents) { return //... } final AsyncFunction relevanceAsyncFun = new AsyncFunction() { @Override public ListenableFuture apply(Document pageContents) throws Exception { return calculateRelevance(pageContents); } }; //... final ListeningExecutorService pool = MoreExecutors.listeningDecorator( Executors.newFixedThreadPool(10) ); final List> relevanceFutures = new ArrayList<>(topSites.size()); for (final URL siteUrl : topSites) { final ListenableFuture future = pool.submit(new Callable() { @Override public String call() throws Exception { return IOUtils.toString(siteUrl, StandardCharsets.UTF_8); } }); final ListenableFuture documentFuture = Futures.transform(future, parseFun); final ListenableFuture relevanceFuture = Futures.transform(documentFuture, relevanceAsyncFun); relevanceFutures.add(relevanceFuture); } final ListenableFuture> futureOfRelevance = Futures.allAsList(relevanceFutures); final ListenableFuture maxRelevanceFuture = Futures.transform(futureOfRelevance, new Function, Double>() { @Override public Double apply(List relevanceList) { return Collections.max(relevanceList); } }); Futures.addCallback(maxRelevanceFuture, new FutureCallback() { @Override public void onSuccess(Double result) { log.debug("Result: {}", result); } @Override public void onFailure(Throwable t) { log.error("Error :-(", t); } }); Was it worth it? Yes and no. Yes, because we learned some really important constructs and primitives used together with futures/promises: chaining, mapping (transforming) and reducing. The solution is beautiful in terms of CPU utilization - no waiting, blocking, etc. Remember that the biggest strength of Node.js is its "no-blocking" policy. Also in Netty futures are ubiquitous. Last but not least, it feels very functional. On the other hand, mainly due to Java syntax verbosity and lack of type inference (yes, we will jump into Scala soon) code seems very unreadable, hard to follow and maintain. Well, to some degree this holds true for all message driven systems. But as long as we don't invent better APIs and primitives, we must learn to live and take advantage of asynchronous, highly parallel computations. If you want to experiment with ListenableFuture even more, don't forget to read official documentation.

March 7, 2013

by Tomasz Nurkiewicz

· 23,499 Views · 1 Like

ListenableFuture in Guava

ListenableFuture in Guava is an attempt to define consistent API for Future objects to register completion callbacks. With the ability to add callback when Future completes, we can asynchronously and effectively respond to incoming events. If your application is highly concurrent with lots of future objects, I strongly recommend using ListenableFuture whenever you can. Technically ListenableFuture extends Future interface by adding simple void addListener(Runnable listener, Executor executor) method. That's it. If you get a hold of ListenableFuture you can register Runnable to be executed immediately when future in question completes. You must also supply Executor (ExecutorService extends it) that will be used to execute your listener - so that long-running listeners do not occupy your worker threads. Let's put that into action. We will start by refactoring our first example of web crawler to use ListenableFuture. Fortunately in case of thread pools it's just a matter of wrapping them using MoreExecutors.listeningDecorator(): ListeningExecutorService pool = MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(10)); for (final URL siteUrl : topSites) { final ListenableFuture future = pool.submit(new Callable() { @Override public String call() throws Exception { return IOUtils.toString(siteUrl, StandardCharsets.UTF_8); } }); future.addListener(new Runnable() { @Override public void run() { try { final String contents = future.get(); //...process web site contents } catch (InterruptedException e) { log.error("Interrupted", e); } catch (ExecutionException e) { log.error("Exception in task", e.getCause()); } } }, MoreExecutors.sameThreadExecutor()); } There are several interesting observations to make. First of all notice how ListeningExecutorService wraps existing Executor. This is similar to ExecutorCompletionService approach. Later on we register custom Runnable to be notified when each and every task finishes. Secondly notice how ugly error handling becomes: we have to handle InterruptedException (which should technically never happen as Future is already resolved and get() will never throw it) and ExecutionException. We haven't covered that yet, but Future must somehow handle exceptions occurring during asynchronous computation. Such exceptions are wrapped in ExecutionException (thus the getCause() invocation during logging) thrown from get(). Finally notice MoreExecutors.sameThreadExecutor() being used. It's a handy abstraction which you can use every time some API wants to use an Executor/ExecutorService (presumably thread pool) while you are fine with using current thread. This is especially useful during unit testing - even if your production code uses asynchronous tasks, during tests you can run everything from the same thread. No matter how handy it is, whole code seems a bit cluttered. Fortunately there is a simple utility method in fantastic Futures class: Futures.addCallback(future, new FutureCallback() { @Override public void onSuccess(String contents) { //...process web site contents } @Override public void onFailure(Throwable throwable) { log.error("Exception in task", throwable); } }); FutureCallback is a much simpler abstraction to work with, resolves future and does exception handling for you. Also you can still supply custom thread pool for listeners if you want. If you are stuck with some legacy API that still returns Future you may try JdkFutureAdapters.listenInPoolThread() which is an adapter converting plain Future to ListenableFuture. But keep in mind that once you start using addListener(), each such adapter will require one thread exclusively to work so this solution doesn't scale at all and you should avoid it. Future future = //... ListenableFuture listenableFuture = JdkFutureAdapters.listenInPoolThread(future); Once we understand the basics we can dive deeply into biggest strength of listening futures: transformations and chaining. This is advanced stuff, you have been warned.

March 4, 2013

by Tomasz Nurkiewicz

· 41,209 Views · 3 Likes

SAP Integration with Talend Components / Connectors (BAPI, RFC, IDoc, BW, SOAP)

talend has several connectors to integrate sap systems. however, this guide is no introduction to talend’s sap components. instead, this guide helps to understand different alternatives to integrate sap systems with talend set up a local sap system configure talend studio for using sap components use talend’s sap wizard run a first talend job which connects to sap all further required information and example use cases for talend’s sap components should be available in the talend component guide at www.help.talend.com . if that’s not the case, please create a jira documentation ticket ( https://jira.talendforge.org/browse/doct )! now let’s take a look at different alternatives for integration of sap systems with talend. alternatives for sap integration three protocols exist for communication between sap and external programs: dynamic information and action gateway (diag): e.g. used by sap gui remote function call (rfc): a function call with input and output parameters (like a java interface) hypertext transfer protocol (http): internet standard the following alternatives are available for integrating sap systems using some of these protocols. file sap supports the direct import of files (call-transaction-program, batch-input, direct input). files have to be in a specific format to be imported. transformation and integration can be realized with talend’s various file components such as tfileinputdelimited. rfc remote function call is the proprietary sap ag interface for communication between a sap system and other sap or third-party compatible system over tcp/ip or cpi-c connections. remote function calls may be associated with sap software and abap programming, and provide a way for an external program (written in languages such as php, asp, java, or c, c++) to use data returned from the server. data transactions are not limited to getting data from the server, but can insert data into server records as well. sap can act as the client or server in an rfc call. a remote function call (rfc) is the call or remote execution of a remote function module in an external system. in the sap system, these functions are provided by the rfc interface system. the rfc interface system enables function calls between two sap systems, or between a sap system and an external system. tsapinput and tsapoutput are talend’s components to use rfcs. business application programming interface (bapi) a bapi is an object-oriented view on most data and transactions of a sap system (called “business objects”). object types of the business objects are stored in the business object repository (bor). bapis are always implemented as rfcs and therefore can be called the same way. additionally, they have the following characteristics (compared to rfcs): stable interface no view layer no exceptions, instead export parameter: “return” most business objects offer the following standard bapis: getlist getdetail change creationfromdata tsapinput and tsapoutput are talend’s components to use bapis. application link enabling (ale) application link enabling (ale) is used for asynchronous messaging between different systems via “intermediate documents (idoc)”. idoc is a sap document format for business transaction data transfers. it is used to realize distributed business processes. idoc is similar to xml in purpose, but differs in syntax. both serve the purpose of data exchange and automation in computer systems, but the idoc technology takes a different approach. while xml allows having some metadata about the document itself, an idoc is obligated to have information at its header like its creator, creation time, etc. while xml has a tag-like tree structure containing data and meta-data, idocs use a table with the data and meta-data. idocs also have a session that explains all the processes which the document passed or will pass, allowing one to debug and trace the status of the document. an idoc consists of control record (it contains the type of idoc, port of the partner, release of sap r/3 which produced the idoc, etc.) data records of different types. the number and type of segments is mostly fixed for each idoc type, but there is some flexibility (for example an sd order can have any number of items). status records containing messages such as 'idoc created', 'the recipient exists', 'idoc was successfully passed to the port', 'could not book the invoice because...' different idoc types are available to handle different types of messages. for example, the idoc format orders01 may be used for both purchase orders and order confirmations. tsapidocinput and tsapidocoutput are talend’s components to use ale / idoc. bapis can also be called asynchronously via ale. all new idocs are even based on bapis. soap web services sap supports soap web services. not just sap as java, but also sap as abap! integration can be realized with talend’s esb / web service components such as tesbrequest, tesbresponse, or tesbconsumer. installation of sap server and client installation can take about 6 to 8 hours, but it is an “all in one installation”, i.e. you can install it overnight. steps for installation: get yourself a windows 7 64 bit laptop or vm with 8+ gb ram and 50+gb free disc space get a sap community account (for free, just register): http://scn.sap.com/welcome download sap netweaver (software downloads --> sap netweaver main releases: http://www.sdn.sap.com/irj/scn/nw-downloads download current version of sap netweaver application server abap 64-bit trial install sap server: follow installation guide – a html website included in the download in root of extracted download folder (start.htm --> there click on “installation” link) install sap gui (rich client frontend): start.htm --> there click on “install sap gui” link and follow instructions download the sap jco for the operating system on which your connector is running. the sap jco is available for download from sap's website at http://service.sap.com/connectors . you must have an sapnet account to access the sap jco (if you do not already have one, contact your local sap basis administrator). usage of sap server hint: you have to use a windows user which has a password (as you need to enter windows credentials when stopping sap). if you have a windows user without a password (for instance if you use windows within a vm on your mac), sap cannot process these credentials (i.e. it cannot process an empty password field) --> change your windows password before starting sap start the management console (windows startmenu --> programs --> sap management console) start and stop the sap server (right click on “nsp” --> start / stop) default user: sap* (sap system super user) password: the one which you entered at installation of sap netweaver, e.g. admin123 usage of sap client a sap client should be used to get information about the sap system (functions, data, etc.) similarly to using e.g. mysql workbench to get information from a mysql database. sap gui (view layer) communicates with sap as abap (business logic layer). the application server communicates with the relational database (db layer). different clients are available for sap: sap gui windows sap gui java web browser external rfc-program for local development demos, sap gui windows is probably the best alternative. start sap gui windows by: clicking shortcut “windows start menu --> sap frontend --> sap logon” entering username and password clicking logon sap transactions in sap, you call sap programs via sap transaction codes. important transactions codes are for example: bapi: bapi explorer, view all sap bapi's se16: data browser, view/add table data se38: program editor here is a list of several other important transaction codes: http://www.sapdev.co.uk/tcodes/tcodes.htm installation of demo data the sap installation includes some demo data. as most people do not want to install “real” sap modules such as sap fi, sap crm or sap bi on their local system, this demo data is perfect for demos using talend’s sap connectors. to install the flight demo on a local sap system, you just have to open the abap editor (transaction: se38) and execute the program sapbc_data_generator. this program generates example data within the flight tables and does some further initializations. here is a good tutorial with more information and how to test the flight application: http://help.sap.com/saphelp_erp60_sp/helpdata/de/db/7c623cf568896be10000000a11405a/content.htm configuration of talend studio to use sap components talend’s sap components are already included in the studio. however, two further steps are required to be able to use them: copy sapjco3.dll to the directory c:/windows/system32 sap java connector jar must be added copy sapjco3.jar to the directory “talend/studio/lib/java” (re-) start talend studio check if sap library is added successfully open view “talend modules” (eclipse --> windows --> show view --> talend --> modules) sort by column “context” look for “tsap*” contexts and check if sapjco3.jar has status “installed” usage of sap components with talend studio this section describes how to use talend’s sap components and the sap wizard in general (using one specific example for calling a bapi). detailed descriptions of all sap components (for using bapis, rfcs, idocs, bw, etc.) are available in the documentation talend_components_rg_x.y.z.pdf at www.help.talend.com . connection to a sap system a connection to a sap system can be done “built-in” or via “metadata --> sap connections” (the latter only in enterprise version). using the latter has several advantages: reuse connection configuration quick check if connection to sap works wizards for retrieving functions from sap (instead of handwriting without wizard) quick test with test parameters if function works before finishing development lifecycle for a sap job development lifecycle for sap job: create connection (if not existing yet) right click on metadata --> sap connection create sap connection follow wizard sap jco version: 3 client: “001” userid: “sap*” password: “admin123” --> as you defined it while installation language: “en” hostname: “localhost” system number: “00” retrieve function (bapi / rfc) right click on created connection click on “retrieve sap function” enter search filter (e.g. bapi_fl*) click on “search” select and double click on your function (e.g bapi_flcust_getlist) you see all input, output and table parameters for this sap function click on “test in” --> here you see parameters in more detail: you now have to define which input and output parameters you want to use --> remove all other by selecting them and clicking “remove” button hint: if you do not remove an input parameter, you usually have to enter a value for it! select the output type - can be a single (single record), a table (list of records), or a structure output hint: difference between table and structure in sap: http://www.sapfans.com/forums/viewtopic.php?f=12&t=119794 if you want to do a quick test: enter values for input parameters (if there are any for your function call), then click “launch” button in this example, there is only an optional input parameter max_rows you should see data in the output fields in this example, you see the record with custname “sap ag” and street “neurottstr. 16” click “finish” button under “metadata --> sap connections --> “your connection” --> sap functions: there you can now see your function (in this example: bapi_flcust_getlist) create sap job drag&drop the created function into a job (without the wizard, you also can enter all data by hand) tsapinput component is proposed automatically. click ok to add it to your job go to “initialize input” and add parameter values in this example, there is just the parameter “max_rows” hint: the parameter value can be changed from a hardcoded value to a variable, of course (just click control space on your keyboard to get access to all available variables via code completion in your studio) go to the tsapinput component and add the desired output mapping (i.e. which values you want to process further with other components scroll to the bottom to “outputs” add the correct table / structure name (in this example: "customer_list") click on mapping (which is empty and has to be filled) click on “mapping”, then click on “…” add the wanted output columns of your sap function add the same names at the column “schema xpathqueries” (do not forget the double quotes here!) click “ok” button connect the tsapinput component to a tlogrowcomponent and synchronize the schema hint: always try out if this works before adding further logic to your job! run and test your job (you will see five rows logged (as you have configured max_rows = 5 that's it. now enjoy talend's sap components :-) best regards, kai wähner (twitter: @kaiwaehner) content from my blog: http://www.kai-waehner.de/blog/2013/03/03/sap-integration-with-talend-components-connectors-bapi-rfc-idoc-bw-soap/

March 4, 2013

by Kai Wähner

CORE

· 32,883 Views · 1 Like

JUnit testing of Spring MVC application: Testing the Service Layer

In continuation of my earlier blogs on Introduction to Spring MVC and Testing DAO layer in Spring MVC, in this blog I will demonstrate how to test Service layer in Spring MVC. The objective of this demo is 2 fold, to build the Service layer using TDD and increase the code coverage during JUnit testing of Service layer. For people in hurry, get the latest code from Github and run the below command mvn clean test -Dtest=com.example.bookstore.service.AccountServiceTest Since in my earlier blog, we have already tested the DAO layer, in this blog we only need to focus on testing service layer. We need to mock the DAO layer so that we can control the behavior in Service layer and cover various scenarios. Mockito is a good framework which is used to mock a method and return known data and assert that in the JUnit. As a first step we define the AccountServiceTestContextConfiguration class with AccountServiceTest class. If you notice there are 2 beans defined in that class and we marked the as a @Configuration which shows that it is a Spring Context class. In the JUnit test we @Autowired AccountService class. And AccountServiceImpl @Autowired the AccountRepository class. When creating the Bean in the configuration file we also stubbed the AccountRepository class using Mockito, @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration public class AccountServiceTest { @Configuration static class AccountServiceTestContextConfiguration { @Bean public AccountService accountService() { return new AccountServiceImpl(); } @Bean public AccountRepository accountRepository() { return Mockito.mock(AccountRepository.class); } } //We Autowired the AccountService bean so that it is injected from the configuration @Autowired private AccountService accountService; @Autowired private AccountRepository accountRepository; During the setup of the JUnit we use Mockito mock findByUsername method to return a predefined account object as below @Before public void setup() { Account account = new AccountBuilder() { { address("Herve", "4650", "Rue de la gare", "1", null, "Belgium"); credentials("john", "secret"); name("John", "Doe"); } }.build(true); Mockito.when(accountRepository.findByUsername("john")).thenReturn(account); } Now we write the tests as below and test both the positive and negative scenarios, @Test(expected = AuthenticationException.class) public void testLoginFailure() throws AuthenticationException { accountService.login("john", "fail"); } @Test() public void testLoginSuccess() throws AuthenticationException { Account account = accountService.login("john", "secret"); assertEquals("John", account.getFirstName()); assertEquals("Doe", account.getLastName()); } } Finally we verify if the findByUsername method is called only once successfully as below in the teardown, @After public void verify() { Mockito.verify(accountRepository, VerificationModeFactory.times(1)).findByUsername(Mockito.anyString()); // This is allowed here: using container injected mocks Mockito.reset(accountRepository); } AccountService class looks as below, @Service @Transactional(readOnly = true) public class AccountServiceImpl implements AccountService { @Autowired private AccountRepository accountRepository; @Override public Account login(String username, String password) throws AuthenticationException { Account account = this.accountRepository.findByUsername(username, password); } else { throw new AuthenticationException("Wrong username/password", "invalid.username"); } return account; } } I hope this blog helped you. In my next blog, I will demo how to build a controller JUnit test. Reference: Pro Spring MVC: With Web Flow by by Marten Deinum, Koen Serneels

March 3, 2013

by Krishna Prasad

· 81,621 Views · 3 Likes

JUnit testing of Spring MVC application: Testing DAO layer

In continuation of my blog JUnit testing of Spring MVC application – Introduction, in this blog, I will show how to design and implement DAO layer for the Bookstore Spring MVC web application using Test Driven development. For people in hurry, get the latest code from Github and run the below command mvn clean test -Dtest=com.example.bookstore.repository.JpaBookRepositoryTest As a part of TDD, Write a basic CRUD (create, read, update, delete) operations on a Book DAO class com.example.bookstore.repository.JpaBookRepository. Don’t have the database wiring yet in this DAO class. Once we build the JUnit tests, we use JPA as a persistence layer. We also use H2 as a inmemory database for testing purpose. Create Book POJO class Create the JUnit test as below, public class JpaBookRepositoryTest { @Test public void testFindById() { Book book = bookRepository.findById(this.book.getId()); assertEquals(this.book.getAuthor(), book.getAuthor()); assertEquals(this.book.getDescription(), book.getDescription()); assertEquals(this.book.getIsbn(), book.getIsbn()); } @Test public void testFindByCategory() { List books = bookRepository.findByCategory(category); assertEquals(1, books.size()); for (Book book : books) { assertEquals(this.book.getCategory().getId(), category.getId()); assertEquals(this.book.getAuthor(), book.getAuthor()); assertEquals(this.book.getDescription(), book.getDescription()); assertEquals(this.book.getIsbn(), book.getIsbn()); } } @Test @Rollback(true) public void testStoreBook() { Book book = new BookBuilder() { { description("Something"); author("JohnDoe"); title("John Doe's life"); isbn("1234567890123"); category(category); } }.build(); bookRepository.storeBook(book); Book book1 = bookRepository.findById(book.getId()); assertEquals(book1.getAuthor(), book.getAuthor()); assertEquals(book1.getDescription(), book.getDescription()); assertEquals(book1.getIsbn(), book.getIsbn()); } } If you notice since the JpaBookRepository is only a skeleton class without implementation, all the tests will fail. As a next step, we need to create a Configuration and wire a datasource, and for the test purpose we will be using H2 database. And we also need to wire this back to JUnit test as below, @Configuration public class InfrastructureContextConfiguration { @Autowired private DataSource dataSource; //some more configurations.. @Bean public DataSource dataSource() { EmbeddedDatabaseBuilder builder = new EmbeddedDatabaseBuilder(); builder.setType(EmbeddedDatabaseType.H2); return builder.build(); } } //JUnit test wiring is as below @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration(classes = { InfrastructureContextConfiguration.class, TestDataContextConfiguration.class }) @Transactional public class JpaBookRepositoryTest { //the test methods } Next step is to setup and teardown sample data in the JUnit test case as below, public class JpaBookRepositoryTest { @PersistenceContext private EntityManager entityManager; private Book book; private Category category; @Before public void setupData() { EntityBuilderManager.setEntityManager(entityManager); category = new CategoryBuilder() { { name("Evolution"); } }.build(); book = new BookBuilder() { { description("Richard Dawkins' brilliant reformulation of the theory of natural selection"); author("Richard Dawkins"); title("The Selfish Gene: 30th Anniversary Edition"); isbn("9780199291151"); category(category); } }.build(); } @After public void tearDown() { EntityBuilderManager.clearEntityManager(); } } Once we do the wiring, we need to implement the com.example.bookstore.repository.JpaBookRepository and use JPA to do the CRUD on the database and run the tests. The tests will succeed. Finally if you run Cobertura for this example from STS, we will get over 90% of line coverage for com.example.bookstore.repository.JpaBookRepository. In case you want to try few exercises you can implement repository for Account and User. I hope this blog helped you. In my next blog I will talk about Mochito and Implementing the Service layer.

March 1, 2013

by Krishna Prasad

· 80,257 Views

How Expensive is a Method Call in Java?

We have all been there. Looking at the poorly designed code while listening to the author’s explanations about how one should never sacrifice performance over design. And you just cannot convince the author to get rid of his 500-line methods because chaining method calls would destroy the performance. Well, it might have been true in 1996 or so. But since then JVM has evolved to be an amazing piece of software. One way to find out about it is to start looking more deeply into optimizations carried out by the virtual machine. The arsenal of techniques applied by the JVM is quite extensive, but lets look into one of them in more details. Namely method inlining . It is easiest to explain via the following sample: int sum(int a, int b, int c, int d) { return sum(sum(a, b),sum(c, d)); } int sum(int a, int b) { return a + b; } When this code is run, the JVM will figure out that it can replace it with a more effective, so called “inlined” code: int sum(int a, int b, int c, int d) { return a + b + c + d; } You have to pay attention that this optimization is done by the virtual machine and not by the compiler. It is not transparent at the first place why this decision was made. After all – if you look at the sample code above – why postpone optimization when compilation can produce more efficient bytecode? But considering also other not-so obvious cases, JVM is the best place to carry out the optimization: JVM is equipped with runtime data besides static analysis. During runtime JVM can make better decisions based on what methods are executed most often, what loads are redundant, when is it safe to use copy propagation, etc. JVM has got information about the underlying architecture – number of cores, heap size and configuration and can thus make the best selection based on this information. But let us see those assumptions in practice. I have created a small test application which uses several different ways to add together 1024 integers. A relatively reasonable one, where the implementation just iterates over the array containing 1024 integers and sums the result together. This implementation is available in InlineSummarizer.java . Recursion based divide-and-conquer approach. I take the original 1024 – element array and recursively divide it into halves – the first recursion depth thus gives me two 512-element arrays, the second depth has four 256-element arrays and so forth. In order to sum together all the 1024 elements I introduce 1023 additional method invocations. This implementation is attached as RecursiveSummarizer.java . Naive divide-and-conquer approach. This one also divides the original 1024-element array, but via calling additional instance methods on the separated halves – namely I nest sum512(), sum256(), sum128(), …, sum2() calls until I have summarized all the elements. As with recursion, I introduce 1023 additional method invocations in the source code . And I have a test class to run all those samples. The first results are from unoptimized code: As seen from the above, the inlined code is the fastest. And the ones where we have introduced 1023 additional method invocations are slower by ~25,000ns. But this image has to be interpreted with a caveat – it is a snapshot from the runs where JIT has not yet fully optimized the code. In my mid-2010 MB Pro it took between 200 and 3000 runs depending on the implementation. The more realistic results are below. I have ran all the summarizer implementations for more than 1,000,000 times and discarded the runs where JIT has not yet managed to perform it’s magic. We can see that even though inlined code still performed best, the iterative approach also flew at a decent speed. But recursion is notably different – when iterative approach close in with just 20% overhead, RecursiveSummarizer takes 340% of the time the inlined code needs to complete. Apparently this is something one should be aware of – when you use recursion, the JVM is helpless and cannot inline method calls. So be aware of this limitation when using recursion. Recursion aside – method overheads are close to being non-existent. Just 205 ns difference between having 1023 additional method invocations in your source code. Remember, those were nanoseconds (10^-9 s) over there that we used for measurement. So thanks to JIT we can safely neglect most of the overhead introduced by method invocations. The next time when your coworker is hiding his lousy design decisions behind the statement that popping through a call stack is not efficient, let him go through a small JIT crash course first. And if you wish to be well-equipped to block his future absurd statements, subscribe to either our RSS or Twitter feed and we are glad to provide you future case studies. Full disclosure: the inspiration for the test case used in this article was triggered by Tomasz Nurkiewicz blog post .

February 27, 2013

by Nikita Salnikov-Tarnovski

· 13,031 Views

JDBC: What Resources You Have to Close and When?

I was never sure what resources in JDBC must be explicitely closed and wasn’t able to find it anywhere explained. Finally my good colleague, Magne Mære, has explained it to me: In JDBC there are several kinds of resources that ideally should be closed after use. Even though every Statement and PreparedStatement is specified to be implicitly closed when the Connection object is closed, you can’t be guaranteed when (or if) this happens, especially if it’s used with connection pooling. You should explicitly close your Statement and PreparedStatement objects to be sure. ResultSet objects might also be an issue, but as they are guaranteed to be closed when the corresponding Statement/PreparedStatement object is closed, you can usually disregard it. Summary: Always close PreparedStatement/Statement and Connection. (Of course, with Java 7+ you’d use the try-with-resources idiom to make it happen automatically.) PS: I believe that the close() method on pooled connections doesn’t actually close them but just returns them to the pool. A request to my dear users: References to any good resources would be appreciate.

February 26, 2013

by Jakub Holý

· 27,146 Views

Text Processing, Part 2: Oh, Inverted Index

This is the second part of my text processing series. In this blog, we'll look into how text documents can be stored in a form that can be easily retrieved by a query. I'll used the popular open source Apache Lucene index for illustration. There are two main processing flow in the system ... Document indexing: Given a document, add it into the index Document retrieval: Given a query, retrieve the most relevant documents from the index. The following diagram illustrate how this is done in Lucene. Index Structure Both documents and query is represented as a bag of words. In Apache Lucene, "Document" is the basic unit for storage and retrieval. A "Document" contains multiple "Fields" (also call zones). Each "Field" contains multiple "Terms" (equivalent to words). To control how the document will be indexed across its containing fields, a Field can be declared in multiple ways to specified whether it should be analyzed (a pre-processing step during index), indexed (participate in the index) or stored (in case it needs to be returned in query result). Keyword (Not analyzed, Indexed, Stored) Unindexed (Not analyzed, Not indexed, Stored) Unstored (Analyzed, Indexed, Not stored) Text (Analyzed, Indexed, Stored) The inverted index is a core data structure of the storage. It is organized as an inverted manner from terms to the list of documents (which contain the term). The list (known as posting list) is ordered by a global ordering (typically by document id). To enable faster retrieval, the list is not just a single list but a hierarchy of skip lists. For simplicity, we ignore the skip list in subsequent discussion. This data structure is illustration below based on Lucene's implementation. It is stored on disk as segment files which will be brought to memory during the processing. The above diagram only shows the inverted index. The whole index contain an additional forward index as follows. Document indexing Document in its raw form is extracted from a data adaptor. (this can be making an Web API to retrieve some text output, or crawl a web page, or receiving an HTTP document upload). This can be done in a batch or online manner. When the index processing start, it parses each raw document and analyze its text content. The typical steps includes ... Tokenize the document (breakdown into words) Lowercase each word (to make it non-case-sensitive, but need to be careful with names or abbreviations) Remove stop words (take out high frequency words like "the", "a", but need to careful with phrases) Stemming (normalize different form of the same word, e.g. reduce "run", "running", "ran" into "run") Synonym handling. This can be done in two ways. Either expand the term to include its synonyms (ie: if the term is "huge", add "gigantic" and "big"), or reduce the term to a normalized synonym (ie: if the term is "gigantic" or "huge", change it to "big") At this point, the document is composed with multiple terms. doc = [term1, term2 ...]. Optionally, terms can be further combined into n-grams. After that we count the term frequency of this document. For example, in a bi-gram expansion, the document will become ... doc1 -> {term1: 5, term2: 8, term3: 4, term1_2: 3, term2_3:1} We may also compute a "static score" based on some measure of quality of the document. After that, we insert the document into the posting list (if it exist, otherwise create a new posting list) for each terms (all n-grams), this will create the inverted list structure as shown in previous diagram. There is a boost factor that can be set to the document or field. The boosting factor effectively multiply the term frequency which effectively affecting the importance of the document or field. Document can be added to the index in one of the following ways; inserted, modified and deleted. Typically the document will first added to the memory buffer, which is organized as an inverted index in RAM. When this is a document insertion, it goes through the normal indexing process (as I described above) to analyze the document and build an inverted list in RAM. When this is a document deletion (the client request only contains the doc id), it fetches the forward index to extract the document content, then goes through the normal indexing process to analyze the document and build the inverted list. But in this case the doc object in the inverted list is labeled as "deleted". When this is a document update (the client request contains the modified document), it is handled as a deletion followed by an insertion, which means the system first fetch the old document from the forward index to build an inverted list with nodes marked "deleted", and then build a new inverted list from the modified document. (e.g. If doc1 = "A B" is update to "A C", then the posting list will be {A:doc1(deleted) -> doc1, B:doc1(deleted), C:doc1}. After collapsing A, the posting list will be {A:doc1, B:doc1(deleted), C:doc1} As more and more document are inserted into the memory buffer, it will become full and will be flushed to a segment file on disk. In the background, when M segments files have been accumulated, Lucene merges them into bigger segment files. Notice that the size of segment files at each level is exponentially increased (M, M^2, M^3). This maintains the number of segment files that need to be search per query to be at the O(logN) complexity where N is the number of documents in the index. Lucene also provide an explicit "optimize" call that merges all the segment files into one. Here lets detail a bit on the merging process, since the posting list is already vertically ordered by terms and horizontally ordered by doc id, merging two segment files S1, S2 is basically as follows Walk the posting list from both S1 and S2 together in sorted term order. For those non-common terms (term that appears in one of S1 or S2 but not both), write out the posting list to a new segment S3. Until we find a common term T, we merge the corresponding posting list from these 2 segments. Since both list are sorted by doc id, we just walk down both posting list to write out the doc object to a new posting list. When both posting lists have the same doc (which is the case when the document is updated or deleted), we pick the latest doc based on time order. Finally, the doc frequency of each posting list (of the corresponding term) will be computed. Document retrieval Consider a document is a vector (each term as the separated dimension and the corresponding value is the tf-idf value) and the query is also a vector. The document retrieval problem can be defined as finding the top-k most similar document that match a query, where similarity is defined as the dot-product or cosine distance between the document vector and the query vector. tf-idf is a normalized frequency. TF (term frequency) represents how many time the term appears in the document (usually a compression function such as square root or logarithm is applied). IDF is the inverse of document frequency which is used to discount the significance if that term appears in many other documents. There are many variants of TF-IDF but generally it reflects the strength of association of the document (or query) with each term. Given a query Q containing terms [t1, t2], here is how we fetch the corresponding documents. A common approach is the "document at a time approach" where we traverse the posting list of t1, t2 concurrently (as opposed to the "term at a time" approach where we traverse the whole posting list of t1 before we start the posting list of t2). The traversal process is described as follows ... For each term t1, t2 in query, we identify all the corresponding posting lists. We walk each posting list concurrently to return a sequence of documents (ordered by doc id). Notice that each return document contains at least one term but can also also contain multiple terms. We compute the dynamic score which is dot product of the query to document vector. Notice that we typically don't concern the TF/IDF of the query (which is short and we don't care the frequency of each term). Therefore we can just compute the sum up all the TF score of the posting list that has a match term after dividing the IDF score (at the head of each posting list). Lucene also support query level boosting where a boost factor can be attached to the query terms. The boost factor will multiply the term frequency correspondingly. We also look up the static score which is purely based on the document (but not the query). The total score is a linear combination of static and dynamic score. Although the score we used in above calculation is based on computing the cosine distance between the query and document, we are not restricted to that. We can plug in any similarity function that make sense to the domain. (e.g. we can use machine learning to train a model to score the similarity between a query and a document). After we compute a total score, we insert the document into a heap data structure where the topK scored document is maintained. Here the whole posting list will be traversed. In case of the posting list is very long, the response time latency will be long. Is there a way that we don't have to traverse the whole list and still be able to find the approximate top K documents ? There are a couple strategies we can consider. Static Score Posting Order: Notice that the posting list is sorted based on a global order, this global ordering provide a monotonic increasing document id during the traversal that is important to support the "document at a time" traversal because it is impossible to visit the same document again. This global ordering, however, can be quite arbitrary and doesn't have to be the document id. So we can pick the order to be based on the static score (e.g. quality indicator of the document) which is global. The idea is that we traverse the posting list in decreasing magnitude of static score, so we are more likely to visit the document with the higher total score (static + dynamic score). Cut frequent terms: We do not traverse the posting list whose term has a low IDF value (ie: the term appears in many documents and therefore the posting list tends to be long). This way we avoid to traverse the long posting list. TopR list: For each posting list, we create an extra posting list which contains the top R documents who has the highest TF (term frequency) in the original list. When we perform the search, we perform our search in this topR list instead of the original posting list. Since we have multiple inverted index (in memory buffer as well as the segment files at different levels), we need to combine the result them. If termX appears in both segmentA and segmentB, then the fresher version will be picked. The fresher version is determine as follows; the segment with a lower level (smaller size) will be considered more fresh. If the two segment files are at the same level, then the one with a higher number is more fresh. On the other hand, the IDF value will be the sum of the corresponding IDF of each posting list in the segment file (the value will be slightly off if the same document has been updated, but such discrepancy is negligible). However, the processing of consolidating multiple segment files incur processing overhead in document retrieval. Lucene provide an explicit "optimize" call to merge all segment files into one single file so there is no need to look at multiple segment files during document retrieval. Distributed Index For large corpus (like the web documents), the index is typically distributed across multiple machines. There are two models of distribution: Term partitioning and Document partitioning. In document partitioning, documents are randomly spread across different partitions where the index is built. In term partitioning, the terms are spread across different partitions. We'll discuss document partitioning as it is more commonly used. Distributed index is provider by other technologies that is built on Lucene, such as ElasticSearch. A typical setting is as follows ... In this setting, machines are organized as columns and rows. Each column represent a partition of documents while each row represent a replica of the whole corpus. During the document indexing, first a row of the machines is randomly selected and will be allocated for building the index. When a new document crawled, a column machine from the selected row is randomly picked to host the document. The document will be sent to this machine where the index is build. The updated index will be later propagated to the other rows of replicas. During the document retrieval, first a row of replica machines is selected. The client query will then be broadcast to every column machine of the selected row. Each machine will perform the search in its local index and return the TopM elements to the query processor which will consolidate the results before sending back to client. Notice that K/P < M < K, where K is the TopK documents the client expects and P is the number of columns of machines. Notice that M is a parameter that need to be tuned. One caveat of this distributed index is that as the posting list is split horizontally across partitions, we lost the global view of the IDF value without which the machine is unable to calculate the TF-IDF score. There are two ways to mitigate that ... Do nothing: here we assume the document are evenly spread across different partitions so the local IDF represents a good ratio of the actual IDF. Extra round trip: In the first round, query is broadcasted to every column which returns its local IDF. The query processor will collected all IDF response and compute the sum of the IDF. In the second round, it broadcast the query along with the IDF sum to each column of machines, which will compute the local score based on the IDF sum.

February 26, 2013

by Ricky Ho

· 9,220 Views

Monitoring an IBM JVM with VisualVM

JDK6 update 7 and onward include a tool called VisualVM. VisualVM is a visual tool with monitoring and profiling capabilities for the JVM. With VisualVM you can: Monitor heap usage Monitor CPU usage Monitor Threads Initiate garbage collections Profile CPU and memory And more… Although VisualVM is distributed with the Oracle JDK, it can also be used to monitor IBM JVM’s. VisualVM is not able to connect to the IBM JVM locally. JMX must be used instead. To enable JMX monitoring on the IBM JVM open the WebSphere administrative console and: navigate to: Server -> Server Types -> WebSphere application servers ->[SERVER_NAME] Expand Java and Process Management and click Process definition Click Java Virtual Machine In the Generic JVM arguments field append the following properties: -Djavax.management.builder.initial= -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=1099 Restart the server To start monitoring the JVM with VisualVM start VisualVM by navigating to [JDK_HOME]\bin and start jvisualvm.exe (please note that when VisualVM is downloaded as a separate package the executable is called visualvm.exe instead). In VisualVM click File -> Add JMX Connection. Specify localhost:1099 in the connection field and click OK. If everything went OK, you should see the localhost:1099 connection under the Local node in the tree on the left. Double-click this node to start monitoring. See the following screenshot for an example: When using a JMX connection to monitor the JVM please be aware that not all functionality can be used compared to monitoring a local JVM. Profiling memory is for example not possible. The above configuration was tested on: Windows 7 64-bit IBM JDK1.6 64 bit WebSphere Application Server version 7.0 Note: Before JDK6 update 7, VisualVM can also be downloaded separately from http://visualvm.java.net/download.html Note: To actually test if port 1099 is listening for connections use (on Windows) the netstat –a command and check wether the port is present and listening.

February 21, 2013

by Jamie Craane

· 30,864 Views · 1 Like

Spring-Test-MVC Junit Testing Spring Security Layer with Method Level Security

For people in hurry get the code from Github. In continuation of my earlier blog on spring-test-mvc junit testing Spring Security layer with InMemoryDaoImpl, in this blog I will discuss how to use achieve method level access control. Please follow the steps in this blog to setup spring-test-mvc and run the below test case. mvn test -Dtest=com.example.springsecurity.web.controllers.SecurityControllerTest The JUnit test case looks as below, @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration(loader = WebContextLoader.class, value = { "classpath:/META-INF/spring/services.xml", "classpath:/META-INF/spring/security.xml", "classpath:/META-INF/spring/mvc-config.xml" }) public class SecurityControllerTest { @Autowired CalendarService calendarService; @Test public void testMyEvents() throws Exception { Authentication auth = new UsernamePasswordAuthenticationToken("[email protected]", "user1"); SecurityContext securityContext = SecurityContextHolder.getContext(); securityContext.setAuthentication(auth); calendarService.findForUser(0); SecurityContextHolder.clearContext(); } @Test(expected = AuthenticationCredentialsNotFoundException.class) public void testForbiddenEvents() throws Exception { calendarService.findForUser(0); } } @Test(expected=AccessDeniedException.class) public void testWrongUserEvents() throws Exception { Authentication auth = new UsernamePasswordAuthenticationToken("[email protected]", "user2"); SecurityContext securityContext = SecurityContextHolder.getContext(); securityContext.setAuthentication(auth); calendarService.findForUser(0); SecurityContextHolder.clearContext(); } If you notice, if the user did not login or if the user is trying to access another users information it will throw an exception. The interface access control is as below, public interface CalendarService { @PreAuthorize("hasRole('ROLE_ADMIN') or principal.id == #userId") List findForUser(int userId); } The PreAuthorize only works on interface so that any implementation that implements this interface has this access control. I hope this blog helps you.

February 21, 2013

by Krishna Prasad

· 23,486 Views

Styling JavaFX Pie Chart with CSS

JavaFX provides certain colors by default when rendering charts. There are situations, however, when one wants to customize these colors. In this blog post I look at changing the colors of a JavaFX pie chart using an example I intend to include in my presentation this afternoon at RMOUG Training Days 2013. Some Java-based charting APIs provided Java methods to set colors. JavaFX, born in the days of HTML5 prevalence, instead uses Cascading Style Sheets (CSS) to allow developers to adjust colors, symbols, placement, alignment and other stylistic issues used in their charts. I demonstrate using CSS to change colors here. In this post, I will look at two code samples demonstrating simple JavaFX applications that render pie charts based on data from Oracle's sample 'hr' schema. The first example does not specify colors and so uses JavaFX's default colors for pie slices and for the legend background. That example is shown next. EmployeesPerDepartmentPieChart (Default JavaFX Styling) package rmoug.td2013.dustin.examples; import javafx.application.Application; import javafx.scene.Scene; import javafx.scene.chart.PieChart; import javafx.scene.layout.StackPane; import javafx.stage.Stage; /** * Simple JavaFX application that generates a JavaFX-based Pie Chart representing * the number of employees per department. * * @author Dustin */ public class EmployeesPerDepartmentPieChart extends Application { final DbAccess databaseAccess = DbAccess.newInstance(); @Override public void start(final Stage stage) throws Exception { final PieChart pieChart = new PieChart( ChartMaker.createPieChartDataForNumberEmployeesPerDepartment( this.databaseAccess.getNumberOfEmployeesPerDepartmentName())); pieChart.setTitle("Number of Employees per Department"); stage.setTitle("Employees Per Department"); final StackPane root = new StackPane(); root.getChildren().add(pieChart); final Scene scene = new Scene(root, 800 ,500); stage.setScene(scene); stage.show(); } public static void main(final String[] arguments) { launch(arguments); } } When the above simple application is executed, the output shown in the next screen snapshot appears. I am now going to adapt the above example to use a custom "theme" of blue-inspired pie slices with a brown background on the legend. Only one line is needed in the Java code to include the CSS file that has the stylistic specifics for the chart. In this case, I added several more lines to catch and print out any exception that might occur while trying to load the CSS file. With this approach, any problems loading the CSS file will lead simply to output to standard error stating the problem and the application will run with its normal default colors. EmployeesPerDepartmentPieChartWithCssStyling (Customized CSS Styling) package rmoug.td2013.dustin.examples; import javafx.application.Application; import javafx.scene.Scene; import javafx.scene.chart.PieChart; import javafx.scene.layout.StackPane; import javafx.stage.Stage; /** * Simple JavaFX application that generates a JavaFX-based Pie Chart representing * the number of employees per department and using style based on that provided * in CSS stylesheet chart.css. * * @author Dustin */ public class EmployeesPerDepartmentPieChartWithCssStyling extends Application { final DbAccess databaseAccess = DbAccess.newInstance(); @Override public void start(final Stage stage) throws Exception { final PieChart pieChart = new PieChart( ChartMaker.createPieChartDataForNumberEmployeesPerDepartment( this.databaseAccess.getNumberOfEmployeesPerDepartmentName())); pieChart.setTitle("Number of Employees per Department"); stage.setTitle("Employees Per Department"); final StackPane root = new StackPane(); root.getChildren().add(pieChart); final Scene scene = new Scene(root, 800 ,500); try { scene.getStylesheets().add("chart.css"); } catch (Exception ex) { System.err.println("Cannot acquire stylesheet: " + ex.toString()); } stage.setScene(scene); stage.show(); } public static void main(final String[] arguments) { launch(arguments); } } The chart.css file is shown next: chart.css /* Find more details on JavaFX supported named colors at http://docs.oracle.com/javafx/2/api/javafx/scene/doc-files/cssref.html#typecolor */ /* Colors of JavaFX pie chart slices. */ .data0.chart-pie { -fx-pie-color: turquoise; } .data1.chart-pie { -fx-pie-color: aquamarine; } .data2.chart-pie { -fx-pie-color: cornflowerblue; } .data3.chart-pie { -fx-pie-color: blue; } .data4.chart-pie { -fx-pie-color: cadetblue; } .data5.chart-pie { -fx-pie-color: navy; } .data6.chart-pie { -fx-pie-color: deepskyblue; } .data7.chart-pie { -fx-pie-color: cyan; } .data8.chart-pie { -fx-pie-color: steelblue; } .data9.chart-pie { -fx-pie-color: teal; } .data10.chart-pie { -fx-pie-color: royalblue; } .data11.chart-pie { -fx-pie-color: dodgerblue; } /* Pie Chart legend background color and stroke. */ .chart-legend { -fx-background-color: sienna; } Running this CSS-styled example leads to output as shown in the next screen snapshot. The slices are different shades of blue and the legend's background is "sienna." Note that while I used JavaFX "named colors," I could have also used "#0000ff" for blue, for example. I did not show the code here for my convenience classes ChartMaker and DbAccess. The latter simply retrieves the data for the charts from the Oracle database schema via JDBC and the former converts that data into the Observable collections appropriate for the PieChart(ObservableList) constructor. It is important to note here that, as Andres Almiray has pointed out, it is not normally appropriate to execute long-running processes from the main JavaFX UI thread (AKA JavaFX Application Thread) as I've done in this and other other blog post examples. I can get away with it in these posts because the examples are simple, the database retrieval is quick, and there is not much more to the chart rendering application than that rendering so it is difficult to observe any "hanging." In a future blog post, I intend to look at the better way of handling the database access (or any long-running action) using the JavaFX javafx.concurrent package (which is well already well described in Concurrency in JavaFX). JavaFX allows developers to control much more than simply chart colors with CSS. Two very useful resources detailing what can be done to style JavaFX charts with CSS are the Using JavaFX Charts section Styling Charts with CSS and the JavaFX CSS Reference Guide. CSS is becoming increasingly popular as an approach to styling web and mobile applications. By supporting CSS styling in JavaFX, the same styles can easily be applied to JavaFX apps as the HTML-based applications they might coexist with.

February 18, 2013

by Dustin Marx

· 6,913 Views

Introduction to JCache JSR 107

Resin has supported caching, session replication (another form of caching), and http proxy caching in cluster environments for over ten years. When you use Resin caching, you are using the same platform that has the speed and scalability of custom services written in C like NginX with the usability of Java, and the industry platform Java EE. JCache JSR 107 is a distributed cache that has a similar interface to the HashMap that you know and love. To be more specific, the Cache object in JCache looks like a java.util.ConncurrentHashMap. In addition, JCache JSR 107 defines integration with CDI (as well as Spring and Guice). You can decorate services with interceptors that apply caching to the services just by defining annotations. Resin 4 has support for JCache, and JCache support is required for Java EE 7. Let's look at a small example to see how easy is to get started with JCache. package hello.world; import javax.cache.Cache; import javax.cache.CacheBuilder; import javax.cache.CacheManager; import javax.cache.Caching; ... @WebServlet("/HelloServlet") public class HelloServlet extends HttpServlet { Cache cache; public Cache cache() { if (cache == null) { //building a cache CacheManager manager = Caching.getCacheManager("cacheManagerHello"); CacheBuilder builder = manager.createCacheBuilder("a"); cache = builder.build(); } return cache; } protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { response.setContentType("text/html"); response.getWriter().append(" "); String helloMessage = cache().get("hello message"); if (helloMessage == null) { helloMessage = new StringBuilder(20) .append("Hello World ! ") .append(System.currentTimeMillis()).toString(); cache().put("hello message", helloMessage); // <-------------- putting results in the cache } response.getWriter().append(helloMessage); response.getWriter().append(" "); } } The above works out fairly well, but what if we want to periodically change the helloMessage. Let's say we get 2,000 requests a second, but every 10 seconds or so we would like to regenerate the helloMessage. The message might be: Hello World ! 1358979745996 Later we would want it to change. If we wanted it to change every 10 seconds after it was last accessed, we would do this: cache = builder.setExpiry(ExpiryType.ACCESSED, new Duration(TimeUnit.SECONDS, 10)).build(); For this example, we want to change it every 10 seconds after is was last modified. We would set up the timeout on the creation as follows: cache = builder.setExpiry(ExpiryType.MODIFIED, new Duration(TimeUnit.SECONDS, 10)).build(); This would go right in the cache method we defined earlier. public Cache cache() { if (cache == null) { CacheManager manager = Caching.getCacheManager("cacheManagerHello"); CacheBuilder builder = manager.createCacheBuilder("b"); cache = builder.setExpiry(ExpiryType.MODIFIED, new Duration(TimeUnit.SECONDS, 10)).build(); } return cache; } Resin's JCache implementation is built on top Resin distributed cache architecture. You get replication, and data redundancy built in. Bill Digman is a Java EE / Servlet enthusiast and Open Source enthusiast who loves working with Caucho's Resin Servlet Container, a Java EE Web Profile Servlet Container. Caucho's Resin OpenSource Servlet Container Java EE Web Profile Servlet Container Caucho's Resin 4.0 JCache blog post

February 13, 2013

by Bill Digman

· 48,916 Views · 1 Like

The Heroes of Java: Marcus Hirt

Lets continue the "Heroes of Java" series. Today's interview has been planned nearly since the launch of the series and I knew that it would be a tough one to get. I know Marcus since a few years now and he is always busy providing the best diagnostic tools to Java developers. Thanks for finally joining, Marcus! It is a pleasure to have you here! Marcus Hirt is one of the founders of Appeal Virtual Machines, the company that created the JRockit JVM. He is currently working as Team Lead for the Java Mission Control team. In his spare time he enjoys coding on his many pet projects, composing music, and scuba diving. Marcus has contributed JRockit related articles, whitepapers, tutorials, and webinars to the JRockit community, and has been an appreciated speaker at various conferences, such as Oracle Open World and Java One. He is also one of the two authors behind a popular book about JVM technology (link to my review). General Part Who are you? I am a computer aficionado with a strikingly unmodern and lengthy romance with typed languages, profiling and diagnostics. I have three kids and a lovely wife, so right now there isn't much spare time to go around. When there was, I used to compose music, play the piano, scuba dive and do martial arts. Your official job title at your company? Consulting Member of Technical Staff Do you care about it? I care about being appreciated for my work. The title itself means nothing. Do you speak foreign languages? Which ones? Swedish is my native tongue and my preferred language for anything that is not computer related. That said, since most of the terminology in our business is in English, I actually prefer English when talking shop. I am half Swiss, and I did spend some time at Real Gymnasium Kirchenfeld in Bern. I haven't used my German since then though, so it is beyond rusty. How long is your daily "bootstrap" process? (Coffee, news, email) If you don't count email, it is almost non-existent. However, in my role as a team leader, email is chewing up a good portion of the morning these days. Thanks to the excellent mass transit system in Stockholm, that is usually taken care of before I arrive at the office. At least during the winters. During the summers I usually drive my motorcycle to work. Twitter You have a twitter handle? Why? I indiscriminately sign up for all social services. Then I find that I don't use most of them. Twitter is a bit of a exception, since I do tend to read what others write. When I do tweet it is mostly about new obscure and/or unsupported features in the Hotspot JDK. My twitter handle is @hirt. Whom are you following in general? I mostly follow people that I know and respect in the Java community. Do you have a personal "policy" for twitter? I try to avoid it at work. Does your company restrict or encourage you with your twitter usage? Oracle has neither actively encouraged nor restricted my twitter usage. The only time I can recall a company actively encouraging me to engage in some official social capacity was some years ago, when BEA tried to encourage people to blog. I’ve since moved away from the official company blog, because of a tooling issue. Work What's your daily development setup? (OS/IDE/VC/other Tools) Since the first target platform for JRockit was Windows, I've stuck with Windows at work. I am using Windows 7/Eclipse/Perforce and Visual Studio. At home I am using Mac OS X/Eclipse/Git&Perforce and XCode. Which is the tool providing most productivity to your work? These days: Eclipse. No doubt. Your preferred way of interacting with co-workers? Face to face for longer discussions. IM is good for smaller things, since you can choose when to handle the interrupt, whilst still being fairly interactive. What's your favorite way of managing your todo's? Pen and paper. Stone age, right? If you could make a wish for a job at your favorite company: What would that be? Whichever would give me the resources to attack some of the high impact development projects on my "want to do" list. Oracle is currently quite a good place to be. Java You're programming in Java. Why? To be honest, I am not exclusively programming in Java. When I do, it is because it is one of the programming environments in which I find myself to be the most productive. It may not be the least verbose or most elegant of languages, but the tooling and debugging capabilities are top notch. Not to mention that some intrinsic features of the language itself, such as the memory management, makes it easier to write error free code. Also, since there has been competition around the JVM for more than a decade, the JVMs for Java are really quite sophisticated. Not to mention fast. What's least fun with Java? It is unnecessarily verbose (more type inference please), type erasure (ever sent in a class to your generic type to have a chance of knowing what runtime type it is?), and any and all things that makes the illusion of an all powerful runtime break down. In a perfect world, a Java programmer should not have to worry about the details of the JVM configuration. For instance, why should I need to estimate how much space I will need for constants and class metadata (perm gen)? Thankfully there is work being done on this as we speak; the perm gen is scheduled for removal in JDK 8. I think there is a lot to be said for improving the usability of the JVM. If you could change one thing with Java, what would that be? There are many who want Java to be everything to everyone. I don't subscribe to that view. Instead, let's make it easy to run whatever language you want on the JVM. That said, if I could change something about the implementation, I would probably want a thread local garbage collector. One with insanely good heuristics as to when to back off and stop handling an object thread locally. Then there are some other things, but since I may start working on them soon, I would rather keep them to myself for now. :) What's your personal favorite in dynamic languages? Ruby is cute. I especially like the implementation on the JVM (JRuby). Which programming technique has moved you forwards most and why? When I first started my education at the Royal Institute of Technology, I had already programmed in various languages, such as Pascal, C and assembly. I really thought I had things figured out, until I came to the first computer science course. There I got confronted with SICP and Scheme. That was IMHO a genius move by the computer science department. All the cocky kids with prior experience, such as myself, got a rich serving of humble-pie. Functional programming taught me very elegant ways of expressing myself. Kudos to MIT and Sussman et al. What was the biggest project you've ever worked on? JRockit and JRockit Mission Control. I was one of the founders of Appeal (Appeal Virtual Machines & Appeal Software Solutions), a big project in itself. Which was the worst programming mistake you did? Well, maybe not strictly a programming mistake, but one of the worst red-face issues I've done is when a JRockit performance counter was slightly misspelled - the 'o' was accidentally dropped from *count. The bug report stated that "the customer was not amused". I must admit I was though. Another fun, deliberate, "mistake" was when I added the following three lines to one of the Mission Control property files: ------ # :) We just felt that we needed this one translated... # /The MC team Template_DEFAULT_TEMPLATE_NAME=All your base are belong to us! ------ I was hoping to get a cynical remark back from the translation team, but they just translated it the best they could. Heh. Finally, one of the worst programming mistakes in recent history was in a small start-up project. A hash code calculation error caused some subtle errors to one of many data points in a running production system. I finally solved the problem when I got fiber installed at home and got bold. In desperation I started a node with jdwp turned on, and I then proceeded to set break points and evaluate code remotely over an ssh tunnel. The latency was so low that it almost felt like a local debugging session. Crazy, but you gotta love Java for providing you with options. ;)

February 13, 2013

by Markus Eisele

· 4,314 Views

Java 8: From PermGen to Metaspace

As you may be aware, the JDK 8 Early Access is now available for download. This allows Java developers to experiment with some of the new language and runtime features of Java 8. One of these features is the complete removal of the Permanent Generation (PermGen) space which has been announced by Oracle since the release of JDK 7. Interned strings, for example, have already been removed from the PermGen space since JDK 7. The JDK 8 release finalizes its decommissioning. This article will share the information that we found so far on the PermGen successor: Metaspace. We will also compare the runtime behavior of the HotSpot 1.7 vs. HotSpot 1.8 (b75) when executing a Java program “leaking” class metadata objects. The final specifications, tuning flags and documentation around Metaspace should be available once Java 8 is officially released. Metaspace: A new memory space is born The JDK 8 HotSpot JVM is now using native memory for the representation of class metadata and is called Metaspace; similar to the Oracle JRockit and IBM JVM's. The good news is that it means no more java.lang.OutOfMemoryError: PermGen space problems and no need for you to tune and monitor this memory space anymore…not so fast. While this change is invisible by default, we will show you next that you will still need to worry about the class metadata memory footprint. Please also keep in mind that this new feature does not magically eliminate class and classloader memory leaks. You will need to track down these problems using a different approach and by learning the new naming convention. I recommend that you read the PermGen removal summary and comments from Jon on this subject. In summary: PermGen space situation This memory space is completely removed. The PermSize and MaxPermSize JVM arguments are ignored and a warning is issued if present at start-up. Metaspace memory allocation model Most allocations for the class metadata are now allocated out of native memory. The klasses that were used to describe class metadata have been removed. Metaspace capacity By default class metadata allocation is limited by the amount of available native memory (capacity will of course depend if you use a 32-bit JVM vs. 64-bit along with OS virtual memory availability). A new flag is available (MaxMetaspaceSize), allowing you to limit the amount of native memory used for class metadata. If you don’t specify this flag, the Metaspace will dynamically re-size depending of the application demand at runtime. Metaspace garbage collection Garbage collection of the dead classes and classloaders is triggered once the class metadata usage reaches the “MaxMetaspaceSize”. Proper monitoring & tuning of the Metaspace will obviously be required in order to limit the frequency or delay of such garbage collections. Excessive Metaspace garbage collections may be a symptom of classes, classloaders memory leak or inadequate sizing for your application. Java heap space impact Some miscellaneous data has been moved to the Java heap space. This means you may observe an increase of the Java heap space following a future JDK 8 upgrade. Metaspace monitoring Metaspace usage is available from the HotSpot 1.8 verbose GC log output. Jstat & JVisualVM have not been updated at this point based on our testing with b75 and the old PermGen space references are still present. Enough theory now, let’s see this new memory space in action via our leaking Java program… PermGen vs. Metaspace runtime comparison In order to better understand the runtime behavior of the new Metaspace memory space, we created a class metadata leaking Java program. You can download the source here. The following scenarios will be tested: Run the Java program using JDK 1.7 in order to monitor & deplete the PermGen memory space set at 128 MB. Run the Java program using JDK 1.8 (b75) in order to monitor the dynamic increase and garbage collection of the new Metaspace memory space. Run the Java program using JDK 1.8 (b75) in order to simulate the depletion of the Metaspace by setting the MaxMetaspaceSize value at 128 MB. JDK 1.7 @64-bit – PermGen depletion Java program with 50K configured iterations Java heap space of 1024 MB Java PermGen space of 128 MB (-XX:MaxPermSize=128m) As you can see form JVisualVM, the PermGen depletion was reached after loading about 30K+ classes. We can also see this depletion from the program and GC output. Class metadata leak simulator Author: Pierre-Hugues Charbonneau http://javaeesupportpatterns.blogspot.com ERROR: java.lang.OutOfMemoryError: PermGen space Now let’s execute the program using the HotSpot JDK 1.8 JRE. JDK 1.8 @64-bit – Metaspace dynamic re-size Java program with 50K configured iterations Java heap space of 1024 MB Java Metaspace space: unbounded (default) As you can see from the verbose GC output, the JVM Metaspace did expand dynamically from 20 MB up to 328 MB of reserved native memory in order to honor the increased class metadata memory footprint from our Java program. We could also observe garbage collection events in the attempt by the JVM to destroy any dead class or classloader object. Since our Java program is leaking, the JVM had no choice but to dynamically expand the Metaspace memory space. The program was able to run its 50K of iterations with no OOM event and loaded 50K+ Classes. Let's move to our last testing scenario. JDK 1.8 @64-bit – Metaspace depletion Java program with 50K configured iterations Java heap space of 1024 MB Java Metaspace space: 128 MB (-XX:MaxMetaspaceSize=128m) As you can see form JVisualVM, the Metaspace depletion was reached after loading about 30K+ classes; very similar to the run with the JDK 1.7. We can also see this from the program and GC output. Another interesting observation is that the native memory footprint reserved was twice as much as the maximum size specified. This may indicate some opportunities to fine tune the Metaspace re-size policy, if possible, in order to avoid native memory waste. Now find below the Exception we got from the Java program output. Class metadata leak simulator Author: Pierre-Hugues Charbonneau http://javaeesupportpatterns.blogspot.com ERROR: java.lang.OutOfMemoryError: Metadata space Done! As expected, capping the Metaspace at 128 MB like we did for the baseline run with JDK 1.7 did not allow us to complete the 50K iterations of our program. A new OOM error was thrown by the JVM. The above OOM event was thrown by the JVM from the Metaspace following a memory allocation failure. #metaspace.cpp Final words I hope you appreciated this early analysis and experiment with the new Java 8 Metaspace. The current observations definitely indicate that proper monitoring & tuning will be required in order to stay away from problems such as excessive Metaspace GC or OOM conditions triggered from our last testing scenario. Future articles may include performance comparisons in order to identify potential performance improvements associated with this new feature. Please feel free to provide any comment.

February 11, 2013

by Pierre - Hugues Charbonneau

· 598,547 Views · 33 Likes

How Do You Organise Maven Sub-Modules?

Being an itinerant programmer one of the things I've noticed over the years is that every project you come across seems to have a slightly different way of organising its Maven modules. There seems to be no conventional way of characterising the contents of a project's sub-modules and not that much discussion on it either. This is strange, as defining the responsibilities of your Maven modules seems to me to be as critical as good class design and coding technique to a project's success. So, in light of this dearth of wisdom, here's my two penneth worth... When you first come across a new project, you'll generally find a layout convention that vaguely matches that defined by the Better Builds With Maven manual. The 'clean' project directory usually contains a POM file, a src folder and several sub-modules, each in their own subdirectory, as shown in the diagram below: If we all agree that this is the standard way of approaching the top level of project layout (and I have seen it done slightly differently) then there seems to be three different approaches taken when organising the responsibilities of each of a project's sub-modules. These are: Totally haphazardly. By class type. By functional area. I'm not going to linger on those projects that are organised seemingly without any structure or order except to say that they probably started off well organised but were not designed well enough to endure the changes forced upon them. In saying that a project's sub-modules are organised 'by class type', I mean that modules are used to group together all classes that comprise, but are not limited to, a layer in the program's architecture. For example a module could contain all classes that make up the program's service or persistence layers or a module could contain all model class (i.e. beans). Conversely, in saying that a project's sub-modules are organised by functional area I'm talking about a situation where each module contains, as close as possible, a vertical slice of the application, including model beans, service layer, controllers etc. If the truth be told then there are any number of ways to organise your project's sub-modules. Most project set-ups are fairly flat in structure, which is what I've demonstrated above; however, if you take a look at Erik Putrycz's 2009 talk Maven – or how to automate java builds, tests and version management with open source tools, he demonstrates that you can have modules within modules within modules. In order to explore this a little further, I'm going to invent my usual preposterously contrived scenario and in this scenario, you've got to write a program for a Welsh dental practice owned by a man called Jones also known locally as 'Jones The Driller'. The requirements would be pretty standard, I suspect, for a dental practice and would include handling: Patients details: name, address, DOB, phone number etc. Medical records, including treatments and outcomes. Appointments. Accounting, e.g. sales, purchase, wages etc. Auditing: as in who did what to whom... As a solution to Jones The Driller's problem, you propose that you write a multi-module web application based upon Spring, MVC and tomcat that, when assembled, has a standard 'n' tier design of a mySQL database, a database layer, service layer, a set of controllers and some JSPs that comprise the view. In creating your project your idea is to organise your sub-modules 'by class type' and you come up with the following module organisation, shown below roughly in build dependency order dentists-model dentists-utils dentist-repository dentists-services dentists-controllers dentists-web ...which on your screen looks something like this: Your dentists-model module contains the project's beans that model object used from the persistence layer right up to the JSPs. dentists-repository, dentists-services and dentists-controllers reflect the various layers of your application, with dentists-web module containing all the JSPs, CSS and other view paraphernalia. As for dentists-utils, well every project has a utils module where all the really useful, but disparate classes end up. Meanwhile, in a different universe, a different version of you decides to organise your project's sub-modules by functional area and you come up with the following breakdown: dentists-utils dentists-audit dentists-user-details dentists-medical-records dentists-appointments dentists-accounts dentists-repository dentists-integration dentists-web In this scenario, the build order is somewhat different; virtually all modules will depend upon dentists-utils and, depending upon your exact audit requirements, most modules will rely upon dentists-audit. You can also see in the following images that the sub-module package structure has been arranged on layer and type boundaries in that each module has its own model, repository (which contains interface definitions only) services and controller packages and that the layout of each module is identical at the top level. Another discussion to have here is the organisation of your project's package structure, where you can ask the same kind of questions: do you organise 'by class type' or 'by functional area' as shown above? You may have noticed that the dentists-repository modules can be fairly near the end of the build cycle as it only contains the implementation of the repository classes and not their interface definitions. You may have also noticed that dentists-web is again a separate module. This is because you're a pretty savvy business guy and in keeping the JSPs etc. in their own module, you hope to re-skin your app and sell it to that other Welsh dentist down the road: Williams The Puller. From a test perspective, each module contains its own unit tests, but there's a separate integration test module that, as it'll take longer to execute can be run when required. There are generally two ways of defining integration tests: firstly by putting them in their own module, which I prefer, and secondly by using a integration test naming convention such as somefileIT.java, and running all *IT.java files separately to all *Test.java files. Your two identical selves have proposed two different solutions to the same problem, so I guess that it's now time to takes a look at the pros and cons of each. Taking the 'by class type' solution first, what can be said about it? On the plus side, it's pretty maintainable in that you always known where to find stuff. Need a service class? Then that's in the dentist-service module. Also, the build order is very straight forward. On the down side, organisation 'by class type' is prone to problems with circular dependencies creeping in and classes with totally different responsibilities are all mixed up together making it difficult to re-use functionality in other projects without unnecessarily dragging in the who shebang. So, what about the pros and cons of the 'by functional area' approach? To my way of thinking, given the package structure of each module, it's just about as easy to locate a class using this technique as it is when using 'by class type'. The real benefit of using this approach is that it's far simpler to re-use a functional area of code in other projects. For example, I've worked on many projects in different companies and have implemented auditing several times. Each time I implement it I usually do it in roughly the same way, so wouldn't it be good just to reuse the first implementation? Not withstanding code ownership issues... The same idea also applies to dentists-user-details; the requirement to manage names and addresses applies equally as well to a shoe sales web site as it does a dental practice. And the downside? One of the benefits of this approach is that the modules are highly decoupled, but from experience no matter how hard you try, you always end up with more coupling that you'd like. You may have already spotted that both of these proposals are not 100% pure; 'by class type' contains a bit of 'by functionality' and conversely 'by functional area' contains a couple of 'by class type' modules. This may be avoidable, but I'm purposely being pragmatic. As I said earlier you always see a utils module in a project. Furthermore creating a separate database module allows you to change your project's database implementation fairly easily, which may make testing easier in some circumstances and likewise, having a separate web module allows you to re-skin your code should you be lucky enough to sell the same product to multiple customers with their own branding. Finally, one of the unwritten truths in software development is that once you've organised your project into its sub-modules you'll rarely get the opportunity to reorganise and improve them: there usually isn't the time or the political will as doing so costs money; however, it should be remembered that, in Agile terms, project module composition is, like code, a form of technical debt, which if done badly also costs you a lot of cash. It therefore seems a really good idea, as a team, to plan out your project thoroughly before starting to code. So be radical, do some design or have a meeting, you know it'll be worth it in the end.

February 8, 2013

by Roger Hughes

· 28,641 Views · 1 Like

How to Compress and Uncompress a Java Byte Array Using Deflater/Inflater

Here is a small code snippet which shows an utility class that offers two methods to compress and extract a Java byte array.

February 6, 2013

by Ralf Quebbemann

· 135,162 Views · 2 Likes

Debugging a Maven Build With mvnDebug

I ran into an interesting little debugging conundrum recently with IntelliJ’s debugger and maven. I was loading a .war within the pre-integration-test phase of another module so I could run a set of intergration tests with the failsafe plugin that depend on that webapp being available. The purpose was to have that webapp loaded for some black box system testing at the end of the maven build. There was a test failure that was localized to this particular set of tests that I couldn’t quite sort out immediately. As usual, the next logical step is to attach a debugger. Since these tests depended on the environment being setup and that environment setup was performed via maven, I couldn’t just Run->Debug on the test itself as I normally can with tests in IntelliJ. So in IntelliJ I first set my breakpoints, go to the maven pane, and hit debug on the install goal for system module. The set of tests run and none of my breakpoints were hit. Bummer. It seems as though IntelliJ is not running maven with the debugging agent enabled even though I’ve explicity hit “Debug” on a maven goal. I’m not entirely sure what was going on with IntelliJ in this instance but here’s how I got around it. Ever since some earlier version of maven (2.0.8, I believe), maven has shipped with a little known executable in its bin/ directory named mvnDebug. In a nutshell, mvnDebug runs maven in all of its glory but includes the debugging arguments the jvm needs to listen for a debugger setup for you conveniently. > alias mvnDebug=/usr/share/maven/bin/mvnDebug > mvnDebug clean install Preparing to Execute Maven in Debug Mode Listening for transport dt_socket at address: 8000 This ends up being the equivalent of invoking java with the proper debugging arguments: java ... -Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y Then in IntelliJ (or your IDE/debugger of choice) set your breakpoints and connect your debugger to localhost:8000 and voila! My breakpoints ended up getting hit and that particular issue was resolved in a matter of minutes.

February 5, 2013

by Jason Whaley

· 71,861 Views · 3 Likes

Link List: MongoDB Drivers for Java

I am working on a Spring MVC app that demonstrates all of the different MongoDB Java APIs. Some Links http://code.google.com/p/morphia/wiki/QuickStart http://www.mongodb.org/display/DOCS/Java+Tutorial http://jongo.org/#updating https://github.com/hibernate/hibernate-ogm https://openshift.redhat.com/community/blogs/configuring-hibernateogm-for-your-jboss-app-using-mongodb-on-openshift-paas http://www.hibernate.org/subprojects/ogm.html https://github.com/impetus-opensource/Kundera/wiki https://github.com/impetus-opensource/Kundera/wiki/Getting-Started-in-5-minutes http://blog.fisharefriends.us/morphia-vs-spring-data-mongodb/ http://www.ibm.com/developerworks/java/library/j-morphia/index.html Maven POM Settings For Various Drivers morphia Morphia http://morphia.googlecode.com/svn/mavenrepo/ default sonatype-nexus Kundera Public Repository https://oss.sonatype.org/content/repositories/releases true false kundera-missing Kundera Public Missing Resources Repository http://kundera.googlecode.com/svn/maven2/maven-missing-resources true true com.google.code.morphia morphia 0.99 org.hibernate.ogm hibernate-ogm-core 4.0.0-SNAPSHOT provided org.mongodb mongo-java-driver 2.10.1 org.springframework.data spring-data-mongodb 1.0.4.RELEASE Hibernate OGM for MongoDB https://community.jboss.org/wiki/PortingSeamHotelBookingExampleToOGM https://github.com/ajf8/seam-booking-ogm https://openshift.redhat.com/community/blogs/configuring-hibernateogm-for-your-jboss-app-using-mongodb-on-openshift-paas https://github.com/openshift/openshift-ogm-quickstart Kundera (JPA for MongoDB) https://github.com/impetus-opensource/Kundera-Examples/wiki/Using-Kundera-with-Spring https://github.com/impetus-opensource/Kundera https://github.com/impetus-opensource/kundera-mongo-performance https://github.com/impetus-opensource/Kundera-Examples https://github.com/impetus-opensource/Kundera/wiki/Sample-Codes-and-Examples https://github.com/impetus-opensource/Kundera-Examples/wiki/Twitter https://dzone.com/articles/sqlifying-nosql-–-are-orm https://github.com/xamry/twitample https://github.com/impetus-opensource/Kundera-Examples/wiki/Cross-datastore-persistence-using-Kundera http://prabhubuzz.wordpress.com/2012/05/25/mongodb-cassandra-jpa-service-using-kundera/ http://gora.apache.org/ http://xamry.wordpress.com/2011/05/02/working-with-mongodb-using-kundera/ https://github.com/impetus-opensource/Kundera/wiki/Getting-Started-in-5-minutes https://github.com/impetus-opensource/Kundera/wiki/Concepts

February 1, 2013

by Tim Spann

CORE

· 3,971 Views · 1 Like

Using JAXB to Generate Java Objects from XML Document

Quite sometime back I had written about Using JAXB to generate XML from the Java, XSD. And now I am writing how to do the reverse of it i.e generating Java objects from the XML document. There was one comment mentioning that JAXB reference implementation and hence in this article I am making use of the reference implementation that is shipped with the JDK. Firstly the XML which I am using to generate the java objects are: Sanaulla Seagate External HDD August 24, 2010 6776.5 Benq HD Monitor August 24, 2012 15000 And the XSD to which it conforms to is: The XSD has already been explained and one can find out more by reading this article. I create a class: XmlToJavaObjects which will drive the unmarshalling operation and before I generate the JAXB Classes from the XSD, the directory structure is: I go ahead and use the xjc.exe to generate JAXB classes for the given XSD: $> xjc expense.xsd and I now refresh the directory structure to see these generated classes as shown below: With the JAXB Classes generated and the XML data available, we can go ahead with the Unmarshalling process. Unmarshalling the XML: To unmarshall: We need to create JAXContext instance. Use JAXBContext instance to create the Unmarshaller. Use the Unmarshaller to unmarshal the XML document to get an instance of JAXBElement. Get the instance of the required JAXB Root Class from the JAXBElement. Once we get the instance of the required JAXB Root class, we can use it to get the complete XML data in Java objects. The code to unmarshal the XML data is given below: package problem; import generated.ExpenseT; import generated.ItemListT; import generated.ItemT; import generated.ObjectFactory; import generated.UserT; import javax.xml.bind.JAXBContext; import javax.xml.bind.JAXBElement; import javax.xml.bind.JAXBException; import javax.xml.bind.Unmarshaller; public class XmlToJavaObjects { /** * @param args * @throws JAXBException */ public static void main(String[] args) throws JAXBException { //1. We need to create JAXContext instance JAXBContext jaxbContext = JAXBContext.newInstance(ObjectFactory.class); //2. Use JAXBContext instance to create the Unmarshaller. Unmarshaller unmarshaller = jaxbContext.createUnmarshaller(); //3. Use the Unmarshaller to unmarshal the XML document to get an instance of JAXBElement. JAXBElement unmarshalledObject = (JAXBElement)unmarshaller.unmarshal( ClassLoader.getSystemResourceAsStream("problem/expense.xml")); //4. Get the instance of the required JAXB Root Class from the JAXBElement. ExpenseT expenseObj = unmarshalledObject.getValue(); UserT user = expenseObj.getUser(); ItemListT items = expenseObj.getItems(); //Obtaining all the required data from the JAXB Root class instance. System.out.println("Printing the Expense for: "+user.getUserName()); for ( ItemT item : items.getItem()){ System.out.println("Name: "+item.getItemName()); System.out.println("Value: "+item.getAmount()); System.out.println("Date of Purchase: "+item.getPurchasedOn()); } } } And the output would be: Do drop in your queries/feedback as comments and I will try to address them at the earliest.

January 29, 2013

by Mohamed Sanaulla

· 169,743 Views

Java concurrency: the hidden thread deadlocks

Most Java programmers are familiar with the Java thread deadlock concept. It essentially involves 2 threads waiting forever for each other. This condition is often the result of flat (synchronized) or ReentrantLock (read or write) lock-ordering problems. Found one Java-level deadlock: ============================= "pool-1-thread-2": waiting to lock monitor 0x0237ada4 (object 0x272200e8, a java.lang.Object), which is held by "pool-1-thread-1" "pool-1-thread-1": waiting to lock monitor 0x0237aa64 (object 0x272200f0, a java.lang.Object), which is held by "pool-1-thread-2" The good news is that the HotSpot JVM is always able to detect this condition for you…or is it? A recent thread deadlock problem affecting an Oracle Service Bus production environment has forced us to revisit this classic problem and identify the existence of “hidden” deadlock situations. This article will demonstrate and replicate via a simple Java program a very special lock-ordering deadlock condition which is not detected by the latest HotSpot JVM 1.7. You will also find a video at the end of the article explaining you the Java sample program and the troubleshooting approach used. The crime scene I usually like to compare major Java concurrency problems to a crime scene where you play the lead investigator role. In this context, the “crime” is an actual production outage of your client IT environment. Your job is to: Collect all the evidences, hints & facts (thread dump, logs, business impact, load figures…) Interrogate the witnesses & domain experts (support team, delivery team, vendor, client…) The next step of your investigation is to analyze the collected information and establish a potential list of one or many “suspects” along with clear proofs. Eventually, you want to narrow it down to a primary suspect or root cause. Obviously the law “innocent until proven guilty” does not apply here, exactly the opposite. Lack of evidence can prevent you to achieve the above goal. What you will see next is that the lack of deadlock detection by the Hotspot JVM does not necessary prove that you are not dealing with this problem. The suspect In this troubleshooting context, the “suspect” is defined as the application or middleware code with the following problematic execution pattern. Usage of FLAT lock followed by the usage of ReentrantLock WRITE lock (execution path #1) Usage of ReentrantLock READ lock followed by the usage of FLAT lock (execution path #2) Concurrent execution performed by 2 Java threads but via a reversed execution order The above lock-ordering deadlock criteria’s can be visualized as per below: Now let’s replicate this problem via our sample Java program and look at the JVM thread dump output. Sample Java program This above deadlock conditions was first identified from our Oracle OSB problem case. We then re-created it via a simple Java program. You can download the entire source code of our program here. The program is simply creating and firing 2 worker threads. Each of them execute a different execution path and attempt to acquire locks on shared objects but in different orders. We also created a deadlock detector thread for monitoring and logging purposes. For now, find below the Java class implementing the 2 different execution paths. package org.ph.javaee.training8; import java.util.concurrent.locks.ReentrantReadWriteLock; /** * A simple thread task representation * @author Pierre-Hugues Charbonneau * */ public class Task { // Object used for FLAT lock private final Object sharedObject = new Object(); // ReentrantReadWriteLock used for WRITE & READ locks private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); /** * Execution pattern #1 */ public void executeTask1() { // 1. Attempt to acquire a ReentrantReadWriteLock READ lock lock.readLock().lock(); // Wait 2 seconds to simulate some work... try { Thread.sleep(2000);}catch (Throwable any) {} try { // 2. Attempt to acquire a Flat lock... synchronized (sharedObject) {} } // Remove the READ lock finally { lock.readLock().unlock(); } System.out.println("executeTask1() :: Work Done!"); } /** * Execution pattern #2 */ public void executeTask2() { // 1. Attempt to acquire a Flat lock synchronized (sharedObject) { // Wait 2 seconds to simulate some work... try { Thread.sleep(2000);}catch (Throwable any) {} // 2. Attempt to acquire a WRITE lock lock.writeLock().lock(); try { // Do nothing } // Remove the WRITE lock finally { lock.writeLock().unlock(); } } System.out.println("executeTask2() :: Work Done!"); } public ReentrantReadWriteLock getReentrantReadWriteLock() { return lock; } } As soon ad the deadlock situation was triggered, a JVM thread dump was generated using JVisualVM. As you can see from the Java thread dump sample. The JVM did not detect this deadlock condition (e.g. no presence of Found one Java-level deadlock) but it is clear these 2 threads are in deadlock state. Root cause: ReetrantLock READ lock behavior The main explanation we found at this point is associated with the usage of the ReetrantLock READ lock. The read locks are normally not designed to have a notion of ownership. Since there is not a record of which thread holds a read lock, this appears to prevent the HotSpot JVM deadlock detector logic to detect deadlock involving read locks. Some improvements were implemented since then but we can see that the JVM still cannot detect this special deadlock scenario. Now if we replace the read lock (execution pattern #2) in our program by a write lock, the JVM will finally detect the deadlock condition but why? Found one Java-level deadlock: ============================= "pool-1-thread-2": waiting for ownable synchronizer 0x272239c0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync), which is held by "pool-1-thread-1" "pool-1-thread-1": waiting to lock monitor 0x025cad3c (object 0x272236d0, a java.lang.Object), which is held by "pool-1-thread-2" Found one Java-level deadlock: ============================= "pool-1-thread-2": waiting for ownable synchronizer 0x272239c0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync), which is held by "pool-1-thread-1" "pool-1-thread-1": waiting to lock monitor 0x025cad3c (object 0x272236d0, a java.lang.Object), which is held by "pool-1-thread-2" Java stack information for the threads listed above: =================================================== "pool-1-thread-2": at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x272239c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945) at org.ph.javaee.training8.Task.executeTask2(Task.java:54) - locked <0x272236d0> (a java.lang.Object) at org.ph.javaee.training8.WorkerThread2.run(WorkerThread2.java:29) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) "pool-1-thread-1": at org.ph.javaee.training8.Task.executeTask1(Task.java:31) - waiting to lock <0x272236d0> (a java.lang.Object) at org.ph.javaee.training8.WorkerThread1.run(WorkerThread1.java:29) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) This is because write locks are tracked by the JVM similar to flat locks. This means the HotSpot JVM deadlock detector appears to be currently designed to detect: Deadlock on Object monitors involving FLAT locks Deadlock involving Locked ownable synchronizers associated with WRITE locks The lack of read lock per-thread tracking appears to prevent deadlock detection for this scenario and significantly increase the troubleshooting complexity. I suggest that you read Doug Lea’s comments on this whole issue since concerns were raised back in 2005 regarding the possibility to add per-thread read-hold tracking due to some potential lock overhead. Find below my troubleshooting recommendations if you suspect a hidden deadlock condition involving read locks: Analyze closely the thread call stack trace, it may reveal some code potentially acquiring read locks and preventing other threads to acquire write locks. If you are the owner of the code, keep track of the read lock count via the usage of the lock.getReadLockCount() method I’m looking forward for your feedback, especially from individuals with experience on this type of deadlock involving read locks. Finally, find below a video explaining such findings via the execution and monitoring of our sample Java program.

January 28, 2013

by Pierre - Hugues Charbonneau

· 105,730 Views · 3 Likes