Blog Objective

This is a blog that attempts to make life easier by noting down the author's accrued knowledge and experiences.
The author has dealt with several IT projects (in Java EE and .NET) and is a specialist in system development.

08 September 2010

Operational Data Store (ODS)

What is an ODS?


An environment:
  1. where data from different OLTP databases is integrated
  2. which provides a view of enterprise data
  3. that addresses operational challenges across more than one business function

Characteristics of ODS:
  1. subject-oriented - caters to a specific function or application (e.g. customer-centricity, risk management)
  2. integrated - from multiple legacy systems or new and legacy systems
  3. timely - data is continuously/ frequently being updated, typically more frequently than daily
  4. current - data is typically current with little history
  5. detailed - data is sufficiently detailed; not only at a summarized level
  6. central version of reference data
ODS should be a separate data store from the data warehouse.

Difference between ODS and Data Warehouse

Aspect | ODS | DW
Data Currency | Current/ near-current | Historical snapshot
Data Loading | Insert/ Update/ Deletion allowed | Only loaded
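
As a rough illustration of the loading difference, the following Python sketch uses an in-memory SQLite database. The table names (ods_customer, dw_customer_snapshot), the columns and the idea of snapshotting the warehouse from the ODS are assumptions made for illustration only: the ODS row is overwritten in place, while the warehouse only ever receives appended, dated copies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ods_customer (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE dw_customer_snapshot (id INTEGER, city TEXT, snapshot_date TEXT);
""")

def load_ods(customer_id, city):
    # ODS: keep only the current value (insert, or replace the existing row)
    conn.execute(
        "INSERT OR REPLACE INTO ods_customer (id, city) VALUES (?, ?)",
        (customer_id, city),
    )

def load_dw(snapshot_date):
    # DW: append a dated copy of the current state; never update or delete
    conn.execute(
        "INSERT INTO dw_customer_snapshot (id, city, snapshot_date) "
        "SELECT id, city, ? FROM ods_customer",
        (snapshot_date,),
    )

load_ods(1, "Singapore")
load_dw("2010-09-07")
load_ods(1, "Kuala Lumpur")   # the customer moves; the ODS row is overwritten
load_dw("2010-09-08")

ods_rows = conn.execute("SELECT id, city FROM ods_customer").fetchall()
dw_rows = conn.execute(
    "SELECT id, city, snapshot_date FROM dw_customer_snapshot ORDER BY snapshot_date"
).fetchall()
```

After the two loads, ods_rows holds a single current row while dw_rows holds one row per snapshot date.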

07 September 2010

Review: Agile Practices

A mindmap review of a book on agile practices.
Flash version can be found here

Review: Principles of Lean Software Development

A mindmap review of this book.
Flash version can be found here

Review: Peter Principle

A mindmap review of this humorous book.
Flash version can be found here

Review: Myth of Multitasking

A mindmap review of this book.
Flash version can be found here

Review: Leadership is an Art

A mindmap review of this book.
Flash version can be found here

Review: Carrot Principle

A mindmap review of this book.
Flash version can be found here

Review: Agile Practices

A mindmap review of a book about Agile Practices.
Flash version can be found here

04 September 2010

Deciding when to use an Agile or Waterfall methodology

Comparison table:











Skill level of developers:












Possible hybrid approach for Brownfield projects
  1. Site survey
  2. Engineering
    • Discovery
    • Re-engineer
    • Generate
    • Test
  3. Acceptance
  4. Deployment

What went wrong with a particular government project?
  1. Requirement solicitation happened prior to any development (as part of the tender exercise)
  2. Vendor organisation (especially management) was not agile enough, but the customer insisted on an unprecedented RAD approach
  3. The supposed prototyping team was
    • Overworked – had to develop the prototype during office hours and prepare for presentation after that
    • Not agile enough
    • Not trained to be agile

What went wrong with a particular private out-sourced project?
  1. Requirement solicitation happened prior to any development (as part of the requirements specification)
  2. Requirements specification was contractual
  3. Customer was not agile enough and was not well-prepared for SCRUM (lack of training, knowledge and acceptance)
  4. Project started with Waterfall approach but changed to SCRUM mid-way
  5. Sprint demonstrations failed because
    1. system tended to be buggy or less than ready for demonstration (confidence inevitably affected)
    2. customer expected the realization of requirements to be completely thought through and analyzed prior to sprints
    3. customer expected the solution to be as per previously agreed (prior to changing to SCRUM approach)
  6. Customer had difficulty appreciating & comprehending user stories compared to the original user requirements
  7. SCRUM approach ended abruptly for product line and project team took over development
  8. On-site customer for sprints was glaringly missing. The customer proxy, a BA employed by the vendor, was not representative of the customer
  9. Customer was not involved in sprints except for the sprint demonstrations

What was common between the projects that led to failure?
  1. At least one of the parties (customer, development team or the management of both sides) was not prepared for Agile methods
  2. The onsite Customer was absent
  3. Projects were contractual in nature
  4. Requirements were solicited and agreed upon. Further, some aspects of the solution may have been agreed upon
  5. Customer knew (or at least, believed he/ she knew) what he/ she wanted

asmx .NET Web Service Nuances

When the WSDL specifies that the minimum occurrence of certain elements is zero, the .NET WSDL proxy generator will generate 2 properties for that element (instead of 1).


On top of what usually gets generated – a property named according to the XML element – the generator creates another property named <ElementName>Specified of Boolean type.

As an example: the WSDL specifies an XML element named policy of type string with minimum occurrence of zero.

<xs:element minOccurs="0" name="policy" type="xs:string" />
The generated proxy code will have a read/ write property named policy as expected
public string policy {

   get { return _policy; }
   set { _policy = value; }
}
Due to the minimum occurrence constraint, another read/ write property named policySpecified will be generated.
public bool policySpecified {
   get { return _policySpecified; }
   set { _policySpecified = value; }
}
This property indicates to the framework that the corresponding property is in use (or has been specified), so that the framework generates the XML element in the SOAP message.
// the correct method of setting optional elements
soapInput.policy = "XXXX";
soapInput.policySpecified = true;
Otherwise, the XML element will not be automatically generated in the SOAP message even though the property appears to have been set.
// will not result in the XML element appearing in the SOAP message
soapInput.policy = "XXXX";
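
The rule can be mimicked in a few lines of Python for readers who want to see the effect on the XML. This is only a sketch of the behaviour described in this post, not the .NET serializer itself; the serialize function is hypothetical and the soapInput root element is borrowed from the variable name in the snippets.

```python
import xml.etree.ElementTree as ET

def serialize(policy, policy_specified):
    # Mimics the rule: the optional element is emitted only when
    # the companion "Specified" flag is true, whatever the value is.
    root = ET.Element("soapInput")
    if policy_specified:
        ET.SubElement(root, "policy").text = policy
    return ET.tostring(root, encoding="unicode")

with_flag = serialize("XXXX", True)      # policy element is present
without_flag = serialize("XXXX", False)  # value was set, but element is omitted
```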

InfoPath 2007 Tips

To publish InfoPath forms to Sharepoint with time-stamped filenames:
  1. Use concat("Submission as at ", now()) as the filename in the publishing interface
To develop using Tools for Office SDK for InfoPath, make use of these:
  1. System.Environment.UserName to derive the logged-in username;
  2. thisXDocument.Role to derive the roles;
  3. thisXDocument.ViewInfos["View Name"] to get to the appropriate view;
  4. thisXDocument.DOM.selectSingleNode to select nodes based on the full XML document;
  5. docActionEvent.Source.selectSingleNode to select nodes based on the event parameter;
  6. Use ActiveX component to derive the domain and username if not using tools for office. Use ActiveXObject("Wscript.Shell") and the process environment.
  7. Calling web services from scripting code is not straightforward, and doing the same from the native InfoPath form without code is buggy!
  8. Monetary values should be handled as String instead of double, as multiplication of double values yields unexpected results due to floating-point rounding
  9. GST multiplication may be as complicated as using string manipulation to keep 2 decimal places:
concat(substring-before((. * 0.07), "."), ".", substring(substring-after((. * 0.07), "."), 1, 2))
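
Outside InfoPath, the same idea (avoid binary floating point and truncate to 2 decimal places) can be sketched with Python's decimal module. The gst function name is hypothetical; the 7% rate is the one used in the formula above.

```python
from decimal import Decimal, ROUND_DOWN

GST_RATE = Decimal("0.07")  # 7%, as in the formula above

def gst(amount):
    # Parse the amount from a string so no binary floating point is involved,
    # then truncate (not round) to 2 decimal places like the XPath expression.
    tax = Decimal(amount) * GST_RATE
    return tax.quantize(Decimal("0.01"), rounding=ROUND_DOWN)

print(gst("100.00"))       # 7.00
print(gst("19.99"))        # 1.39 (1.3993 truncated)
print(0.1 + 0.2 == 0.3)    # False: the kind of surprise doubles produce
```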

// Derive the logged-in domain and username from the process environment
var wshShell = new ActiveXObject("Wscript.Shell");
var wshEnv = wshShell.Environment("Process");
var username = wshEnv.Item("UserName");
var domain = wshEnv.Item("UserDomain");
var fullLogin = domain + "\\" + username;
//XDocument.UI.Alert("Found Domain User: " + fullLogin);

// Store the login in the form
var usernameElement = XDocument.DOM.selectSingleNode("//my:Username");
usernameElement.text = fullLogin;

// Pass the login to the GetUserInfo data object
var getUserInfoDao = XDocument.DataObjects("GetUserInfo");
getUserInfoDao.DOM.setProperty("SelectionNamespaces", "xmlns:tns='http://schemas.microsoft.com/sharepoint/soap/directory/'");
getUserInfoDao.DOM.setProperty("SelectionLanguage", "XPath");
var inputParam = getUserInfoDao.DOM.selectSingleNode("//tns:GetUserInfo/tns:userLoginName");
inputParam.text = fullLogin;

// Run the query and copy the returned name into the form
var getUserInfoAdapter = getUserInfoDao.QueryAdapter;
getUserInfoAdapter.Query();
var displayName = getUserInfoDao.DOM.selectSingleNode("//tns:GetUserInfoResult/tns:GetUserInfo/tns:User/@Name").value;
var preparedByNameElement = XDocument.DOM.selectSingleNode("//AuditTrail/PreparedBy/Name");
preparedByNameElement.text = displayName;

SharePoint 2007 Tips

To display the username in a Title column
  1. Choose Default value as Calculated Value;
  2. Type in: =REPLACE(Me,1,FIND("\",Me),"");
For Team Discussion to work
  1. Subject field is used only in the first article within the thread;
  2. A web-part would mainly use the Subject and Replies fields for listing;
  3. Threaded view – which displays only Threading – is the only useful view;
To sort a list by the abbreviated month
  1. Create a choice type for month input ([Report Mth]). The values should be the abbreviated month (e.g. Jan, Feb, Mar, etc);
  2. Create a calculated column to assign a numeric month to the list;
  3. The formula should be: =MONTH("01-"&[Report Mth]&"-"&1990);
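
The trick behind the formula is to build a full date from the abbreviation and let the date parser return the month number. A minimal Python sketch of the same idea follows; month_number is a hypothetical helper, and %b assumes English month abbreviations in the default locale.

```python
from datetime import datetime

def month_number(abbrev):
    # Build "01-<Mon>-1990" and read back the month, as the calculated column does.
    # %b matches abbreviated month names in the current locale (English assumed).
    return datetime.strptime("01-" + abbrev + "-1990", "%d-%b-%Y").month

report_months = ["Mar", "Jan", "Dec", "Feb"]
sorted_months = sorted(report_months, key=month_number)
```

Sorting on the derived number gives calendar order, whereas sorting on the text alone would give alphabetical order.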
To find the difference between 2 dates
  1. Use the DATEDIF function on 2 date fields
  2. DATEDIF(d1 : Date, d2: Date, "D") : Number
  3. Example: DATEDIF(dateColumn, [Today], "D")
To convert a text value (from InfoPath) to numeric
  1. Create a calculated column of numeric type
  2. Apply formula =VALUE([Column Name])
To apply a filter to test for empty values
  1. Use a field of the following types: Date, Text Column, Numeric
  2. Leave the value empty (to test against empty/ null value)
To apply a filter with complex AND/ OR conditions
  1. AND takes precedence over OR
  2. For a condition such as: X AND (Y OR Z)
  3. Use X AND Y OR X AND Z in the filter
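
Because the filter has no parentheses, X AND (Y OR Z) has to be expanded by distributing X over the OR, giving X AND Y OR X AND Z. The small truth-table check below is a sketch in Python, where "and" also binds tighter than "or"; the function names are made up for illustration.

```python
from itertools import product

def intended(x, y, z):
    return x and (y or z)

def flattened(x, y, z):
    # Expanded form without parentheses: (x and y) or (x and z)
    return x and y or x and z

def naive(x, y, z):
    # Merely dropping the parentheses changes the meaning: (x and y) or z
    return x and y or z

rows = list(product([False, True], repeat=3))
flattened_matches = all(intended(*r) == flattened(*r) for r in rows)
naive_mismatches = [r for r in rows if intended(*r) != naive(*r)]
```

The expanded form agrees with the intended condition on all eight rows; the naive form disagrees whenever X is false and Z is true.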
To apply a filter on view based on user
  1. Ensure a column of Person type is available
  2. Apply [Me] as a filtering criteria for the View
To hide the Title column in a list
  1. Allow Management of Content Type for the list
  2. List Item Content Type (which contains the Title column) can now be edited
  3. Change the Title column to Hidden
To create a customised calculated ID (e.g. calculated column, Create date in yyMMdd, 4 character ID prepended with 0) in a list
  1. Create a Calculated Column
  2. Use the following for formula:
=[My Column]&"-"&TEXT([Created],"yyMMdd")&"-"&LEFT("0000",4-LEN([ID]))&[ID]
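
To make the expected result of the formula concrete, here is a rough Python equivalent. The custom_id function and the sample values ("INV", item 42) are hypothetical. Note that the formula as written assumes the ID has at most 4 digits.

```python
from datetime import date

def custom_id(my_column, created, item_id):
    # <My Column>-<Created as yyMMdd>-<ID left-padded with zeros to 4 characters>
    return f"{my_column}-{created.strftime('%y%m%d')}-{str(item_id).zfill(4)}"

example = custom_id("INV", date(2010, 9, 4), 42)
print(example)  # INV-100904-0042
```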

To synchronise a list with Excel 2007 (Excel 2003 is able to do so internally)
  1. Install the Excel add-in (XL2007SynchronizeWSSandExcel.exe) and follow the installation instructions (http://msdn.microsoft.com/en-us/library/bb462636(office.11).aspx)
  2. Install an ActiveX (http://www.softfluent.com/wsslists.htm) to reroute all .IQY to Excel
  3. From Excel, choose the Table option to Synchronize with SharePoint

03 September 2010

Database Best Practices

This is summarised from a book titled: Data Modeling
Some best practices are described below:


Database indexing
  • index foreign keys
  • an index on a column with a lot of null values is of little use
  • frequently updated columns should not be indexed
  • may not be a bad idea to use table scans for small tables (less than 1K rows)
  • short-rowed tables (few columns) should use index-organised table
  • b-tree index benefits performance if values are selective (distinct). The higher the index selectivity ratio, the better
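
The selectivity ratio mentioned in the last point is commonly taken as the number of distinct values divided by the total number of rows. A minimal sketch of that calculation, with made-up sample columns:

```python
def index_selectivity(values):
    # Selectivity ratio = distinct values / total rows.
    # 1.0 means every value is unique; values near 0 mean many duplicates.
    if not values:
        return 0.0
    return len(set(values)) / len(values)

customer_ids = [101, 102, 103, 104]    # unique per row: good b-tree candidate
gender_codes = ["M", "F", "M", "F"]    # two distinct values: poor candidate

print(index_selectivity(customer_ids))  # 1.0
print(index_selectivity(gender_codes))  # 0.5
```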

Database views
  • perform better than SQL statements since views are pre-compiled (but Oracle does cache statements)
  • stored procedures perform better than views generally

Naming convention
  • Constraint: <TableName>_<Type>_<ColumnName> where <Type> may be PK, FK, UQ (unique constraint), CK (check constraint)
  • Index: <TableName>_<Type>_<ColumnName> where <Type> may be UX (unique index), IX (non-unique)
  • View: <EntityOrTableName>_VW


Code table structure

Many applications require the use of code tables. Instead of creating many different code tables, an alternative is to create a generic code table structure.

The structure is as follows:
  • CodeItem is used to store the code types
  • CodeItemValue is used to store the code key-value pairs



An example of its use follows:
  • codeItem.itemName = "Country"
  • codeItemValue.codeValue = "SG"; codeItemValue.codeValueDesc = "Singapore"
  • codeItemValue.codeValue = "MY"; codeItemValue.codeValueDesc = "Malaysia"
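
One possible shape for the two tables, sketched with an in-memory SQLite database. The itemName, codeValue and codeValueDesc columns come from the example above; the id and codeItemId key columns and the lookup helper are assumptions, since the original diagram is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CodeItem (
        id INTEGER PRIMARY KEY,
        itemName TEXT NOT NULL UNIQUE
    );
    CREATE TABLE CodeItemValue (
        codeItemId INTEGER NOT NULL REFERENCES CodeItem(id),
        codeValue TEXT NOT NULL,
        codeValueDesc TEXT NOT NULL,
        PRIMARY KEY (codeItemId, codeValue)
    );
""")

# One code type ("Country") with two key-value pairs
cur = conn.execute("INSERT INTO CodeItem (itemName) VALUES (?)", ("Country",))
country_id = cur.lastrowid
conn.executemany(
    "INSERT INTO CodeItemValue (codeItemId, codeValue, codeValueDesc) VALUES (?, ?, ?)",
    [(country_id, "SG", "Singapore"), (country_id, "MY", "Malaysia")],
)

def lookup(item_name):
    # Return all key-value pairs for one code type as a dict
    rows = conn.execute(
        "SELECT v.codeValue, v.codeValueDesc "
        "FROM CodeItemValue v JOIN CodeItem i ON i.id = v.codeItemId "
        "WHERE i.itemName = ? ORDER BY v.codeValue",
        (item_name,),
    ).fetchall()
    return dict(rows)

countries = lookup("Country")
```

Adding a new code type then means inserting rows rather than creating another table.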

System Architecture and Design Trade-off Document

It is a great idea to write a System Architecture & Design (SAD) Trade-off Document.

The format is tabular and will look like the following:
  1. Module/ category
  2. Issue description
  3. Possible alternatives
  4. Decision
  5. Decision rationale
  6. Traded-off attributes
  7. Traded-in attributes
  8. Consequence/ constraints introduced
Some examples of traded attributes are listed in the following table:
Category | Attribute
System Performance | Reliability - ability of the system to maintain operating over time (MTTF)
System Performance | Performance - responsiveness of the system to stimuli or events as well as throughput of the system
System Control | Maintainability - ease with which a system can be modified to correct faults, improve performance, or adapt to changing environment
System Control | Data timeliness - data latency for information flowing into and out of the system
System Control | Security - measure of system's ability to resist unauthorised access or DOS
System Control | Supportability - ease with which a system can be maintained operationally
System Control | Testability - ease of system testing
System Control | Usability - measure of a user's ability to utilise a system effectively
System Control | Decoupling - measure of how systems are independent of one another
System Control | Data integrity - integrity of the overall structure
System Control | Functionality/ feature - ability of the system to do the work for which it was intended
System Control | Familiarity - based on existing skill-set of the developers
System Control | Cost reduction - reduction in general project cost to meet timeline or due to cost constraint
System Control | Effort reduction - reduction in general project cost to meet timeline or due to cost constraint
System Evolution | Extensibility - ability of the system to extend for enhancements
System Evolution | Reusability - degree to which an artefact can be used in other systems
System Evolution | Flexibility - ease with which the system can be modified for use in environment not originally intended for
System Evolution | Interoperability - ability of the system to exchange information with other systems
The rationale for writing this is as follows. The document serves as a means for:
  1. communication to the rest of the team regarding important principles and decisions
  2. putting the team on the "same page" regarding the rationale behind decisions impacting system architecture and design
  3. communicating with senior management & stakeholders who may be interested in the decisions
  4. future hand-over to another team or posterity
  5. forcing explicit reasoning to validate decisions
  6. re-evaluating a decision when conditions change in the future
A sample follows:

Deferring Decisions

At times, it is a good idea to defer decisions until the last responsible moment.
In so doing, more information may be made available such that a more informed decision can be made.

Deferred Decision - Delay commitment (making a decision or a choice) until the last responsible moment (when inaction would result in a potentially irreversible outcome)

Difference between a Report and a Query

Apart from the nomenclature difference, there are some distinctions between a report and a query.
The following table summarises the differences:

Unit testing legacy code

What is legacy code?
Some define legacy code as code without proper unit tests.

Consider these:
  • Did you just write some legacy code yesterday?
  • What happens when you are tasked to take over the maintenance of someone else's code; someone's legacy code?
  • What happens when you need to modify someone's legacy code?
Some steps to take when planning to unit test legacy code:
  1. identify the area of change
  2. build a safety net over the area before touching/ changing it
  3. refactor the code to ease adding new code
  4. write a unit test for the issue
  5. write code for the fix/ enhancement
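
The "safety net" in step 2 is often built from characterisation tests: tests that record what the code does today, before anything is changed. A minimal sketch follows. The legacy_total function and its discount rule are invented purely for illustration.

```python
def legacy_total(prices, discount_code):
    # Hypothetical legacy function: no tests, unclear intent
    total = 0
    for p in prices:
        total += p
    if discount_code == "VIP":
        total = total * 0.9
    return round(total, 2)

def characterize():
    # Step 2: pin down the current behaviour, whether or not it is "right"
    assert legacy_total([], None) == 0
    assert legacy_total([10, 20], None) == 30
    assert legacy_total([10, 20], "VIP") == 27.0
    assert legacy_total([10, 20], "vip") == 30   # lower-case code is ignored today
    return True

safety_net_passes = characterize()
```

Once these pass, refactoring (step 3) can proceed with some confidence, and the test for the actual issue (step 4) is added alongside them.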
Some links to refer to:

Using iPad for the Insurance industry?

Some links to using iPad for the enterprise/ insurance
Links to iPad for eSignature
Concerns:
Web service (in native form) may not be suitable for data transfer, especially for scanned attachments. This is because Base-64 encoding increases the original file size by roughly a third. MTOM may not be easily implemented for the iPad without the appropriate framework. Binary Transfer may be considered:
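
The size increase from Base-64 is easy to measure: every 3 bytes of input become 4 characters of output. The sketch below uses random bytes as a stand-in for a scanned attachment; the 300,000-byte size is arbitrary.

```python
import base64
import os

def base64_overhead(raw):
    # Ratio of encoded size to original size
    return len(base64.b64encode(raw)) / len(raw)

scan = os.urandom(300_000)                  # stand-in for a scanned attachment
encoded_size = len(base64.b64encode(scan))  # 400,000 characters
ratio = base64_overhead(scan)               # about 1.33
```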

SCM: Use a Branch or a Tag?

When do we use a branch or a tag for source code management?

Salient features for each are listed.

Tag
  1. code snapshot for a short duration of time
  2. tag gives more control to developers
Branch
  1. code development isolated from the main trunk
  2. particularly for enhancements
  3. work to be done on a historical version
  4. major code changes
  5. concurrent multi-user development (with the main trunk)

Integration Strategies

Types of integration
Data-level
  1. Share data; not behavior
  2. Minimal change (if any) to both source and target systems
  3. Can be database or file-based
  4. ETL
  5. File-data transfer
  6. Direct database access
    1. Bypass business logic (may need to duplicate)
    2. Overall data integrity may be compromised
    3. If writable, may lead to data corruption and referential integrity violations
  7. Capabilities include
    1. Data transformation
    2. Data validation
    3. Data access
    4. Schema definition
    5. Mapping
    6. Schema recognition

Application-level

  1. Share functionality – business logic
  2. Based on API
  3. Composite applications

Business Process level
  1. Share business processes
  2. Specified using BPMN
  3. Glued together using BPEL(4WS) and BPML
  4. Start by defining business processes; then specify logical integration within it
  5. Capabilities include:
    1. Rules processing
    2. Business transaction management
    3. Workflow
    4. Orchestration
    5. Event processing
    6. Schedule

Presentation
  1. Share views
  2. Using a portal
  3. Non-invasive

Application Integration methods
  1. Web services - Application integration
  2. ETL
    1. Data integration
    2. Consolidation of multiple data sources
  3. Communication message protocol - E.g. HTTP, TCP/IP, FTP
  4. Screen-scraping
  5. Program calls - Application integration
  6. Direct data access - Data integration
  7. File transfer - Unidirectional batch file transfer
  8. Human intervention

Enterprise Integration Patterns

Deciding factors
  1. Application coupling
  2. Intrusiveness
  3. Technology selection
  4. Data format
  5. Data timeliness – Latency
  6. Data or functionality required
  7. Remote communication – synchronous or asynchronous
  8. Reliability
Choices

Order is in increasing sophistication and complexity


  1. File transfer –
    1. Pros: simplicity; applications are decoupled (availability does not matter); platform and implementation independence
    2. Cons: lacks timeliness (data integrity issues caused by stale data); may have data semantic dissonance; huge dataset duplicated
  2. Shared database –
    1. Pros: enforces an agreed-upon data format and allows speedy implementation; no semantic dissonance; no data replication; more timely
    2. Cons: difficult to design a shared schema; may be dependent on software upgrade; allowing writes may cause deadlocks; may result in performance issues; applications are coupled to the shared database; no collaboration
  3. Remote procedure invocation –
    1. Pros: shared functionality; maintains data integrity; no semantic dissonance
    2. Cons: tight coupling between applications (availability matters); prone to failure & difficult to maintain without management infrastructure
  4. Messaging –
    1. Pros: frequent exchanges of small messages; storage schema can be changed; asynchronous with retries; more decoupled than remote procedure invocation; more reliable
    2. Cons: semantic dissonance still occurs; difficult to test & debug
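
The "asynchronous with retries" property of messaging can be sketched in-process with a plain queue. This is only a toy: a real message broker would sit between separate applications, and the deliver function, the flaky_handler receiver and the retry limit are all hypothetical.

```python
import queue

def deliver(messages, handler, max_attempts=3):
    # The sender only puts messages on the channel; it never calls the receiver.
    channel = queue.Queue()
    for m in messages:
        channel.put((m, 1))

    delivered, dead = [], []
    while not channel.empty():
        message, attempt = channel.get()
        try:
            handler(message)
            delivered.append(message)
        except Exception:
            if attempt < max_attempts:
                channel.put((message, attempt + 1))   # re-queue for a later retry
            else:
                dead.append(message)                  # give up after max_attempts
    return delivered, dead

calls = {"n": 0}

def flaky_handler(message):
    # Hypothetical receiver that is unavailable for its first two calls
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ConnectionError("receiver unavailable")

delivered, dead = deliver(["order-1", "order-2"], flaky_handler)
```

Both messages are eventually delivered even though the receiver was down at first, which is the availability decoupling that remote procedure invocation lacks.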

Integration Styles and Strategies
These are a few of the widely used integration styles.

When deciding the appropriate style to use, always consider the pros and the cons based on the various criteria. The implication is this: one size doesn't fit all.

The following table summarises the points for consideration: