Archive | November, 2007

Web Utility Package

Http Proxy Servlet
Both Flex and AJAX applications are subject to a browser security restriction: they can only access the web server they were originally loaded from. For your Flex or AJAX app to reach a web service elsewhere on the Net, you need to deploy a proxy on the web server it can access. In PHP, you can use this PHP proxy (code). In Java, you can use this proxy servlet (code).
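To make the relay idea concrete, here is a minimal sketch. It is not the proxy servlet linked above: it uses the JDK's built-in com.sun.net.httpserver instead of the servlet API so that it runs without a container, and the class name, the /proxy and /service paths and the local "remote" server are all assumptions made up for the demo.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a same-origin relay; names and paths are illustrative only.
final class SameOriginProxySketch {

    /** Reads a stream fully into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Fetches a URL with GET and returns the body as text. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            return new String(readAll(in), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    /** Starts a proxy on a free local port that relays GET /proxy to targetUrl. */
    static HttpServer startProxy(String targetUrl) throws IOException {
        HttpServer proxy = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        proxy.createContext("/proxy", exchange -> {
            // The server-side call is not bound by the browser's same-origin rule.
            byte[] body = fetch(targetUrl).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        proxy.start();
        return proxy;
    }

    /** Starts a stand-in remote service, calls it through the proxy, returns the relayed body. */
    static String demo() {
        try {
            HttpServer remote = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            remote.createContext("/service", exchange -> {
                byte[] body = "hello from remote".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            remote.start();
            HttpServer proxy = startProxy(
                    "http://127.0.0.1:" + remote.getAddress().getPort() + "/service");
            try {
                return fetch("http://127.0.0.1:" + proxy.getAddress().getPort() + "/proxy");
            } finally {
                proxy.stop(0);
                remote.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real proxy would also restrict which target URLs it is willing to relay, since an open relay can be abused by anyone who finds it.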

Shell Servlet
If you want to issue commands to the web server and have it run them, you can do that with Shell Servlet. With it, you can do web administration online, even from a WAP phone.

Reference:

http://www.servletsuite.com/servlets.htm


Web Technology – Application Events

Web application events
The Servlet 2.3 specification introduced web application events, which give you a greater degree of control over your web application. The two important application events are:

  1. Application startup and shutdown
  2. Session creation and invalidation

As their names suggest, the application startup event occurs when your web application is first loaded and started by the servlet container, and the application shutdown event occurs when the web application is shut down.

The session creation event occurs every time a new session is created on the server; similarly, the session invalidation event occurs every time a session is invalidated. To make use of these web application events and do something useful with them, you have to create special “listener” classes.

Listener
These are simple Java classes which implement one of the two following interfaces:

  1. javax.servlet.ServletContextListener
  2. javax.servlet.http.HttpSessionListener

If you want your class to listen for application startup and shutdown events, implement the ServletContextListener interface. If you want your class to listen for session creation and invalidation events, implement the HttpSessionListener interface.

ServletContextListener – This interface contains two methods:
[code]
public void contextInitialized(ServletContextEvent sce);
public void contextDestroyed(ServletContextEvent sce);
[/code]

HttpSessionListener – This interface also contains two methods:
[code]
public void sessionCreated(HttpSessionEvent se);
public void sessionDestroyed(HttpSessionEvent se);
[/code]

Example of usage – use HttpSessionListener to count the number of active sessions:
[code]
package com.stardeveloper.web.listener; // matches the listener-class registered in web.xml below

import javax.servlet.http.HttpSessionListener;
import javax.servlet.http.HttpSessionEvent;

public class SessionCounter implements HttpSessionListener {
    private static int activeSessions = 0;

    /* Session Creation Event */
    public void sessionCreated(HttpSessionEvent se) {
        activeSessions++;
    }

    /* Session Invalidation Event */
    public void sessionDestroyed(HttpSessionEvent se) {
        if (activeSessions > 0)
            activeSessions--;
    }

    public static int getActiveSessions() {
        return activeSessions;
    }
}
[/code]
Register the listener with the web application:
[code]
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Web.xml -->
<!DOCTYPE web-app
 PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
 "http://java.sun.com/dtd/web-app_2_3.dtd">
<web-app>
 <!-- Listeners -->
 <listener>
  <listener-class>
  com.stardeveloper.web.listener.SessionCounter
  </listener-class>
 </listener>
 <listener>
  <listener-class>
  com.stardeveloper.web.listener.ApplicationWatch
  </listener-class>
 </listener>
</web-app>
[/code]

Spring Events
Spring provides a simple mechanism for sending and receiving events between beans. To receive an event, a bean implements ApplicationListener, which has a single method:
[code]
public void onApplicationEvent(ApplicationEvent event);
[/code]
To publish events to listeners, you call the publishEvent() method on the ApplicationContext. This publishes the same event to every listener in the context. Event listeners receive events synchronously, which means the publishEvent() method blocks until all listeners have finished processing the event. It is possible to supply an alternate event publishing strategy via an ApplicationEventMulticaster implementation. Furthermore, when a listener receives an event it operates inside the transaction context of the publisher, if a transaction context is available.
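The blocking behaviour described above can be illustrated without Spring on the classpath. The sketch below is a plain-Java stand-in, not Spring's API: SimpleListener, SimplePublisher and the "OrderPlaced" event are hypothetical names invented for the example. It only shows that a synchronous publish call returns after every listener has run.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a listener bean; not Spring's ApplicationListener. */
interface SimpleListener {
    void onEvent(String event);
}

/** Hypothetical stand-in for the publishing side of the context. */
final class SimplePublisher {
    private final List<SimpleListener> listeners = new ArrayList<>();

    void register(SimpleListener listener) {
        listeners.add(listener);
    }

    /** Delivers the event to every listener on the caller's thread, in registration order. */
    void publish(String event) {
        for (SimpleListener listener : listeners) {
            listener.onEvent(event);
        }
    }
}

final class SyncEventDemo {
    /** Returns the order in which things happened around one publish call. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        SimplePublisher publisher = new SimplePublisher();
        publisher.register(e -> log.add("audit saw " + e));
        publisher.register(e -> log.add("mailer saw " + e));

        log.add("before publish");
        publisher.publish("OrderPlaced"); // does not return until both listeners have run
        log.add("after publish");
        return log;
    }
}
```

Because delivery happens on the publisher's thread, a slow listener delays the publisher, which is the trade-off an alternate multicaster strategy would address.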

A bean can be a listener, a publisher, or both. A publisher needs access to the ApplicationContext, which means the bean has to be made aware of the container it is running in. You can create your own custom event by extending the ApplicationEvent class. In addition to events that are published by other beans, the Spring container itself publishes a handful of events during the course of an application’s lifetime. These application events include:

  1. ContextClosedEvent – published when the application context is closed
  2. ContextRefreshedEvent – published when the application context is initialized or refreshed
  3. RequestHandledEvent – published when a request is handled

Put them together
Look at how Acegi publishes session creation and destruction events to the beans listening for them. It has an HttpSessionEventPublisher class that implements HttpSessionListener, so the web container triggers its sessionCreated and sessionDestroyed methods when a session is created and destroyed. Within these methods, the publisher uses the ApplicationContext to publish its own HttpSessionCreatedEvent and HttpSessionDestroyedEvent to all the Spring beans listening for them. (code)

Reference
http://java.sys-con.com/read/171482_1.htm (Acegi)
http://www.acegisecurity.org/articles.html (Acegi)


Designer Best Practices

Software development is an art to me. Developing an application that gets the current job done may not pose a challenge. However, if you want to develop an application that can be easily extended for future requirements, you need to give it some thought before coding. This article is not going to teach you OO design, because you can find that in many good books. It is not teaching you design patterns either, because no specific problem is being presented here to solve. What this article gives you is some design guidelines that I have learnt so far from reviewing the architecture of some successful open source solutions like Quartz and Struts. My goal is to help you develop an application that can embrace change and gradually grow into something solid and mature enough to ship. Many good design methodologies today help you do one thing: minimize code change. When your application grows in size and complexity, you want to manage that complexity because you don’t want your development time to increase proportionally with it. To manage complexity, there are practices you may have learnt in the old days, like:

  1. Encapsulation – like using a car’s braking system without necessarily knowing the details of how it functions.
  2. Increase cohesion and decrease coupling – so you can easily spot the piece of code that needs to change, and minimize the ripple effect of a code change by reducing inter-component dependencies. This is the driving force behind component design.
  3. Apply design patterns – patterns make your code easy to understand for anyone who knows them, which dramatically improves communication and code clarity. The key is that you don’t need to re-invent the wheel for a problem that is already solved.
  4. Code against interfaces – keep implementation details hidden and replaceable. Unit testing becomes easier. With the advent of IoC, you can inject the concrete implementation at runtime, so you don’t need to change any code to replace an implementation.
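As a small illustration of coding against an interface with the implementation injected from outside, consider the sketch below. All names (MessageSender, SmtpSender, FakeSender, WelcomeService) are hypothetical, and the wiring is done by hand where an IoC container would normally do it.

```java
/** The only thing the business code knows about sending messages. */
interface MessageSender {
    String send(String to, String body);
}

/** Stand-in for a production implementation (no real SMTP call is made here). */
final class SmtpSender implements MessageSender {
    @Override
    public String send(String to, String body) {
        return "smtp -> " + to + ": " + body;
    }
}

/** Test double that can replace the production sender without touching WelcomeService. */
final class FakeSender implements MessageSender {
    @Override
    public String send(String to, String body) {
        return "fake -> " + to + ": " + body;
    }
}

/** Business logic depends on the interface; the concrete sender is injected. */
final class WelcomeService {
    private final MessageSender sender;

    WelcomeService(MessageSender sender) {
        this.sender = sender;
    }

    String welcome(String user) {
        return sender.send(user, "Welcome, " + user + "!");
    }
}
```

Swapping `new WelcomeService(new SmtpSender())` for `new WelcomeService(new FakeSender())` changes the behaviour without any change to WelcomeService itself, which is what makes unit testing easier.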

These are golden rules that I have followed and that have proved to work. Below are some guidelines that I have seen emerge in this industry after years of evolution around these concepts.

  1. Look at the volatile part of your code and externalize it (e.g. to XML) – as BPEL does for workflow.
  2. Use AOP to factor cross-cutting concerns such as security, logging and transaction management out of the code. This helps you focus on the business logic and makes your system more consistent. The concept is further enhanced by annotations.
  3. Use IoC for object creation and lifecycle management.
  4. Put extension points into your system – listener and plugin APIs. A listener registers for notification when the state of a subject changes; for example, you will see people register listeners for session creation and destruction. A plugin extends the capability of the system without modifying the existing code.
  5. Wire up components declaratively via a meta language, as Pentaho does with action sequences in reporting.
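The plugin idea from guideline 4 can be sketched in a few lines. This is only an illustration under assumed names: ExportPlugins and the "upper" and "shout" formats are made up, and a real system would more likely discover plugins from configuration than register them by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical extension point: new export formats are added by registration only. */
final class ExportPlugins {
    private final Map<String, UnaryOperator<String>> plugins = new LinkedHashMap<>();

    /** Adds a capability without modifying any existing code in this class. */
    void register(String format, UnaryOperator<String> exporter) {
        plugins.put(format, exporter);
    }

    /** Dispatches to whichever plugin was registered for the requested format. */
    String export(String format, String text) {
        UnaryOperator<String> plugin = plugins.get(format);
        if (plugin == null) {
            throw new IllegalArgumentException("no plugin registered for " + format);
        }
        return plugin.apply(text);
    }
}
```

The core class never names a concrete format, so shipping a new one means registering one more plugin rather than editing and retesting the dispatcher.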

Those are the guidelines I have found so far. Please suggest more to make the list complete.

Reference

http://www.onjava.com/pub/a/onjava/2004/11/10/ExtendingStruts.html


Open source Enterprise Content Management – Alfresco

Enterprise without ECM

  1. A search turns up three or four documents with the same name, making it impossible to tell which is the correct version. This leads to the alarming statistic that 42% of employees accidentally use the wrong information at least once a week. Such mistakes can have a significant impact on a business.
  2. Documents have no version control. Users cannot return to a previous version once a document has been modified and saved. Furthermore, more than one user may edit the same shared document at the same time, causing data loss that a locking facility would prevent.
  3. Weak search capability. Users cannot easily find the document they are looking for.
  4. Weak protection. Users may access documents they are not supposed to see because access control is not sophisticated enough.
  5. No workflow associated with the documents. Normally, you create several folders such as draft, review and authorized to simulate one.
  6. No audit information, such as who authored a document and when, who reviewed it, what comments were made and when, when it was approved and when it was made public.
  7. Hard to access.

The heart of ECM is the Document Model

  1. Every document lives in a folder
  2. Every document has different properties (meta-data).
  3. Every document can be accessed by different users, groups or roles.
  4. Every document may have comments about it or be in a different format, for example PDF.
  5. Every document has security.
  6. Every document has version control and an audit trail around it.
  7. Every document can be logged.

That’s a basic document model and is what has to be managed in a document management system.
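The document model listed above maps naturally onto a small data structure. The sketch below is purely illustrative and is not Alfresco's API: ManagedDocument and its methods are hypothetical, it keeps everything in memory, and it checks permissions only on reads.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory document: folder, properties, access list, versions, audit log. */
final class ManagedDocument {
    private final String folder;
    private final String name;
    private final Map<String, String> properties = new LinkedHashMap<>();
    private final Set<String> readers = new HashSet<>();
    private final List<String> versions = new ArrayList<>();
    private final List<String> auditLog = new ArrayList<>();

    ManagedDocument(String folder, String name, String author, String content) {
        this.folder = folder;
        this.name = name;
        properties.put("author", author); // meta-data
        readers.add(author);              // the author can always read
        versions.add(content);            // version 1
        auditLog.add(author + " created v1");
    }

    String path() {
        return folder + "/" + name;
    }

    void grantRead(String user) {
        readers.add(user);
    }

    /** Stores a new version instead of overwriting; returns the new version number. */
    int save(String user, String content) {
        versions.add(content);
        auditLog.add(user + " saved v" + versions.size());
        return versions.size();
    }

    /** Returns the requested version (1-based) if the user is allowed to read it. */
    String read(String user, int version) {
        if (!readers.contains(user)) {
            throw new SecurityException(user + " may not read " + path());
        }
        auditLog.add(user + " read v" + version);
        return versions.get(version - 1);
    }

    List<String> auditLog() {
        return new ArrayList<>(auditLog);
    }
}
```

Keeping every saved version and logging each access is what lets a user return to an earlier draft and lets an auditor see who touched the document and when.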

Alfresco – open source ECM

Now we can all see the benefits of ECM. Why haven’t more companies adopted it? Because traditional ECM is too complicated to use. That is where Alfresco comes into the picture. To address this, Alfresco has the following features:

  1. It makes its repository look like a shared file drive. The Alfresco repository has a native CIFS (Common Internet File System) shared drive interface, implemented in pure Java. This allows users to literally go to their C: drive or desktop and drag-and-drop content into Alfresco, or simply select File/Save As, exactly as with a shared drive. Because of this, users can keep using their desktop and Microsoft Briefcase. They can even drag content out of Alfresco straight into the Briefcase, and it works because the Briefcase thinks it is accessing a Microsoft shared file drive.
  2. It makes workflow easy to use on documents.
  3. It enables automatic meta-data extraction. If a user drags a Word document from their desktop into Alfresco, its meta-data or properties can be extracted automatically. These may include the content owner’s name, keywords or document titles, and they can all be extracted without being retyped into Alfresco, enabling the user to search on them at a future date.
  4. It associates rules with folders to trigger workflow. For example, when a document is dropped into a particular folder, a workflow is started to review it. This is possible because the rules are stored and maintained at the server level.

Deployment Diagram

alfresco1.JPG

Architecture Overview

Alfresco is built on top of many open source solutions such as Spring, Lucene, Hibernate…

alfresco21.JPG

 


Basic hardware knowledge

What should we look at for a machine?

  1. CPU (how many cores, how many physical CPUs, how fast, 64-bit or not, cache size)
  2. Memory (RAM)
  3. IO speed

Dual-core CPU vs multiprocessor

A dual-core CPU is a CPU with two separate cores on the same die, each with its own cache. It’s the equivalent of getting two microprocessors in one. A multiprocessor system has two or more CPUs physically on the motherboard. An attractive property of a dual-core processor is that it does not require a new motherboard. Quad-core CPUs (four cores) are available now as well. For example, the Dell PowerEdge 6950 has 4 quad-core AMD Opteron 8300 processors.

In a single-core or traditional processor, the CPU is fed strings of instructions it must order, execute, then selectively store in its cache for quick retrieval. When data outside the cache is required, it is retrieved through the system bus from random access memory (RAM) or from storage devices. Accessing these slows performance down to the maximum speed the bus, RAM or storage device will allow, which is far slower than the speed of the CPU. The situation is compounded when multi-tasking. In this case the processor must switch back and forth between two or more sets of data streams and programs. CPU resources are depleted and performance suffers.

In a dual-core processor, each core handles incoming data strings simultaneously to improve efficiency. While one core is executing, the other can be accessing the system bus or executing its own code. Adding to this favorable scenario, both AMD’s and Intel’s dual-core flagships are 64-bit.

To utilize a dual-core processor, the operating system must be able to schedule threads across multiple cores, and the software must be written to be multi-threaded so that its work can be served to the cores in parallel. Software that runs on a single thread will only keep one core busy.
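What "written to be multi-threaded" means in practice can be sketched as follows. The class name ParallelSum and the workload (summing 1..n) are assumptions picked for illustration; the point is only that the work is split into independent chunks that the operating system is free to schedule on different cores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelSum {

    /** Sums 1..n by splitting the range into one chunk per worker thread. */
    static long sum(long n, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = (n + threads - 1) / threads; // ceiling division
            for (int t = 0; t < threads; t++) {
                final long from = t * chunk + 1;
                final long to = Math.min(n, (t + 1) * chunk);
                // Each task touches only its own range, so tasks can run in parallel.
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (long i = from; i <= to; i++) {
                        s += i;
                    }
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // wait for each chunk and combine the results
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A typical call is `ParallelSum.sum(1_000_000L, Runtime.getRuntime().availableProcessors())`, which sizes the pool to the number of cores the JVM can see.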

Memory is important

If you cache data, you need memory. Caching is one of the key weapons for boosting performance: it can dramatically reduce the number of database calls and eliminate unnecessary query processing time. So you want to give your system more RAM, and it is cheap too!
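To show how a cache trades memory for fewer database calls, here is a minimal least-recently-used cache built on java.util.LinkedHashMap. The class name LruCache and the loader function standing in for a database query are assumptions for the example; it is not thread-safe and treats a null value as a miss.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical read-through cache that counts how often the backing store is hit. */
final class LruCache<K, V> {
    private final Function<K, V> loader; // stands in for a database query
    private final Map<K, V> entries;
    private int loads = 0;

    LruCache(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict once the memory budget is exceeded
            }
        };
    }

    /** Returns the cached value, calling the loader only on a miss. */
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    /** Number of times the backing store had to be queried. */
    int loads() {
        return loads;
    }
}
```

The capacity is the knob that ties this back to RAM: a larger capacity means fewer evictions and fewer trips to the database, at the cost of more memory held by the cache.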

Disk Storage

When we talk about IO, we are looking at the power of our storage devices. Here are some of the common terms:

  1. SCSI – It’s a fast bus that can connect lots of devices to a computer at the same time, including hard drives, scanners, CD-ROM/RW drives, printers and tape drives. Other technologies, like serial-ATA (SATA), have largely replaced it in new systems, but SCSI is still in use.  
  2. RAID – SCSI is often used to control a redundant array of independent disks (RAID). Other technologies, like serial-ATA (SATA), can also be used for this purpose. A RAID is a series of hard drives treated as one big drive. Disk arrays stripe data across multiple disks and access them in parallel to achieve high throughput. But a large disk array is highly vulnerable to disk failure. The solution to this lower reliability is to improve availability through redundancy (fault tolerance). However, redundancy lowers write performance because the replicas must be kept consistent. Data striping (RAID 0) improves performance; redundancy via mirroring (RAID 1) improves availability. RAID 01 is a mirror of stripes; RAID 10 is a stripe of mirrors. For example, with 10 disks in RAID 10 you have 5 mirrored pairs, and data is striped across those pairs. RAID 1+0 is more robust than RAID 0+1: after one disk fails, RAID 1+0 loses data only if that disk’s mirror partner also fails, whereas in RAID 0+1 a single failure takes a whole stripe set out of service, so a failure of any disk in the other set loses the array.
  3. SAS – The newest type of SCSI, called Serial Attached SCSI (SAS), uses SCSI commands but transmits data serially. SAS uses a point-to-point serial connection to move data at 3.0 gigabits per second.
  4. SATA vs SAS – In terms of storage per dollar (GB/$), SATA is a lot better than SAS. But in terms of performance, SATA drives (10K rpm) are slower than SAS drives (15K rpm). In addition, SAS is more reliable.
  5. iSCSI – iSCSI is one of two main approaches to storage data transmission over IP networks; the other is Fibre Channel over IP (FCIP).
  6. SAN – Storage Area Network (SAN) is a high-speed subnetwork of shared storage devices. A storage device is a machine that contains nothing but a disk or disks (disk array) for storing data. A SAN’s architecture works in a way that makes all storage devices available to all servers on a LAN or WAN. As more storage devices are added to a SAN, they too will be accessible from any server in the larger network. In this case, the server merely acts as a pathway between the end user and the stored data.
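The RAID trade-offs in item 2 can be put into numbers with a little arithmetic. The sketch below is a simplified model with hypothetical names (RaidMath, usableGb, fatalSecondFailures): it assumes identical disks, an even disk count for the nested levels, two-way mirroring, and a RAID 0+1 controller that takes a whole stripe set offline when one of its disks fails.

```java
final class RaidMath {

    /** Usable capacity in GB for the given number of identical disks. */
    static int usableGb(String level, int disks, int diskGb) {
        switch (level) {
            case "0":
                return disks * diskGb;       // striping only, no redundancy
            case "1":
                return diskGb;               // every disk holds the same copy
            case "10":
            case "01":
                return (disks / 2) * diskGb; // half the disks hold mirror copies
            default:
                throw new IllegalArgumentException("unsupported level " + level);
        }
    }

    /** After one disk has failed, how many of the remaining disks would be fatal to lose. */
    static int fatalSecondFailures(String level, int disks) {
        switch (level) {
            case "10":
                return 1;         // only the failed disk's mirror partner
            case "01":
                return disks / 2; // any disk in the surviving stripe set
            default:
                throw new IllegalArgumentException("unsupported level " + level);
        }
    }
}
```

For 10 disks, this model gives RAID 10 one fatal candidate out of the 9 remaining disks and RAID 01 five out of 9, at the same usable capacity, which is the robustness difference described in item 2.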