27 May 2010

ACCESS DENIED: Project currently locked. The project is currently locked by the Administrator.

Problem (Abstract)
IBM® InfoSphere™ DataStage®: If a large number of users perform concurrent activities such as logging in to a client, designing jobs, or compiling jobs, some of these activities can fail.

Symptom
The activities can fail with one or more of the following error messages:
  • Error calling subroutine DSR_JOB (Action=5); check DataStage is set up correctly in project dstage_perf (Subroutine failed to complete successfully (30107)) (This problem can occur on any operating system)
  • ACCESS DENIED: Project currently locked. The project is currently locked by the Administrator. Try again later or see your Administrator.
  • Record J\x\yyyyyy in file DS_JOBOBJECTS is locked by another user

Remove Duplicate and Sort Warning Message

Error Message:
rmd_msisdn_y: When checking operator: User inserted sort "srt_EFF_DT" does not fulfill the sort requirements of the downstream operator "rmd_msisdn_y"

This typically means the keys (or their order or case sensitivity) on the user-inserted Sort stage "srt_EFF_DT" do not match the key columns the downstream Remove Duplicates stage "rmd_msisdn_y" requires; aligning the sort keys with the Remove Duplicates keys clears the warning.


24 May 2010

ETL Design Consideration - Multiple Steps/Stages/Areas Required for Different Purposes

Normally we'll need multiple steps/stages/areas, which can be flat files, datasets, or RDBMS tables, to achieve the following -
  1. Reusability
  2. Restartablility
  3. Performance
  4. Quality Assurance
  5. Reconciliation
  6. Data Integrity
  7. Change Detection
An RDBMS is typically the recommended storage for the End Stage of each of the above steps/stages/areas, while datasets are preferred for the Intermediate Stage.

The names may differ, but the functions/purposes/needs should be the same -
  1. Staging Area - This is where we load the source files into the RDBMS. Using an RDBMS reduces the impact on ETL jobs when source files change: only the loading ETL jobs need to change, which is simple since they are typically 1-to-1 loads. It also facilitates reconciliation, quality assurance, and testing, as we can run queries against the RDBMS rather than against flat files.
  2. Working Area - This is where we store the X-Ref tables used to standardize all the codes, along with all the intermediate tables/datasets the ETL uses to produce the final tables for the Holding Area. Source change detection is done here if the source cannot deliver only the records that have changed.
  3. Holding Area - This is the final step before moving the data into the SOR. Its tables are identical to the SOR's, and normally we keep only one day of data here. This area lets us perform quality assurance, reconciliation, data-integrity checks, and change detection. It also gives better loading performance and flexibility: the Holding Area may not need all the indexes the SOR requires, and it avoids the table locking a LOAD requires, which is bad for the SOR if loading happens during office hours.
  4. SOR - This is where our Industry Models live. Because of the highly normalized nature of our Industry Model, it may be difficult for users to work with, and often even for data modelers to understand, as the model comes with hundreds of entities and thousands of attributes. We need to build views to simplify the model and allow better usage.
  5. Data Marts - These are normally optional on a high-performance RDBMS such as DB2 with DPF, or our main competitor Teradata. We use views, MQTs, Cube Services, MDC, etc. to reduce the need to build physical data marts, since physical data marts also increase ETL effort and reduce system availability.
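
To illustrate the reconciliation purpose above, a minimal sketch: compare the record count of a source extract with the count landed in the Staging Area. The file names are hypothetical, and the staged count is simulated here with a flat-file export; in a real setup that count would come from the staging RDBMS (for example `db2 -x "SELECT COUNT(*) FROM STG.CUSTOMER"`).

```shell
# Hypothetical reconciliation check between a source extract and its staged copy.
printf 'a\nb\nc\n' > source_extract.dat   # stand-in source file (3 records)
printf 'a\nb\nc\n' > staged_export.dat    # stand-in for the staged row count

src=$(wc -l < source_extract.dat)         # records received from the source
stg=$(wc -l < staged_export.dat)          # records landed in staging

if [ "$src" -eq "$stg" ]; then
    echo "RECONCILED: $src records"
else
    echo "MISMATCH: source=$src staged=$stg"
fi
```

Running the same check between each pair of adjacent areas (Staging vs Working, Holding vs SOR) is what makes the multi-area design auditable.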

08 May 2010

How to Restart IIS

1. login as dsadm - Stop DS
[dsadm@ETL_Dev_1 script]$ cd $DSHOME
[dsadm@ETL_Dev_1 DSEngine]$ . ./dsenv
[dsadm@ETL_Dev_1 DSEngine]$ ./bin/uv -admin -stop

2. login as root (should be dsadm) - Stop ASB Node
[root@ETL_Dev_1 bin]# cd /opt/IBM/InformationServer/ASBNode/bin
[root@ETL_Dev_1 bin]# ./NodeAgents.sh stop
Agent stopped.
rm: remove write-protected regular file `Agent.pid'? y
LoggingAgent stopped.
rm: remove write-protected regular file `LoggingAgent.pid'? y

3. login as root (should be dsadm) - Stop Metadata Server
[root@ETL_Dev_1 bin]# cd /opt/IBM/InformationServer/ASBServer/bin
[root@ETL_Dev_1 bin]# ./MetadataServer.sh stop
ADMU0118E: The log file cannot be written to location
           /opt/IBM/WebSphere/AppServer/profiles/default/logs/server1/stopServer.log;
           please specify a different location with -logfile
ADMU0128I: Starting tool with the default profile
ADMU3100I: Reading configuration for server: server1
ADMU3201I: Server stop request issued. Waiting for stop status.
ADMU4000I: Server server1 stop completed.

4. login as root - Start Metadata Server
[root@ETL_Dev_1 bin]# cd /opt/IBM/InformationServer/ASBServer/bin
[root@ETL_Dev_1 bin]# ./MetadataServer.sh run
ADMU0116I: Tool information is being logged in file
           /opt/IBM/WebSphere/AppServer/profiles/default/logs/server1/startServer.log
ADMU0128I: Starting tool with the default profile
ADMU3100I: Reading configuration for server: server1
ADMU3200I: Server launched. Waiting for initialization status.
ADMU3000I: Server server1 open for e-business; process id is 14906

5. login as root (should be dsadm) - Start ASB Node
[root@ETL_Dev_1 bin]# cd /opt/IBM/InformationServer/ASBNode/bin
[root@ETL_Dev_1 bin]# ./NodeAgents.sh start
6. login as dsadm - Start DS
[dsadm@ETL_Dev_1 script]$ cd $DSHOME
[dsadm@ETL_Dev_1 DSEngine]$ . ./dsenv
[dsadm@ETL_Dev_1 DSEngine]$ ./bin/uv -admin -start
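
The six steps above can be collected into one script. This is a sketch, not a supported utility: the paths and users follow the transcript, and with DRY_RUN=1 (the default here) it only records the planned commands in restart_plan.log instead of touching a live engine.

```shell
# Sketch of the full IIS restart sequence; set DRY_RUN=0 on a real engine host.
DRY_RUN=${DRY_RUN:-1}
ISHOME=/opt/IBM/InformationServer
DSHOME=${DSHOME:-$ISHOME/Server/DSEngine}
: > restart_plan.log

run() {                             # log each command; execute only when live
    echo "$*" >> restart_plan.log
    [ "$DRY_RUN" = 1 ] || "$@"
}

run su - dsadm -c ". $DSHOME/dsenv && $DSHOME/bin/uv -admin -stop"   # 1. stop DS engine
run "$ISHOME/ASBNode/bin/NodeAgents.sh" stop                         # 2. stop ASB agents
run "$ISHOME/ASBServer/bin/MetadataServer.sh" stop                   # 3. stop metadata server
run "$ISHOME/ASBServer/bin/MetadataServer.sh" run                    # 4. start metadata server
run "$ISHOME/ASBNode/bin/NodeAgents.sh" start                        # 5. start ASB agents
run su - dsadm -c ". $DSHOME/dsenv && $DSHOME/bin/uv -admin -start"  # 6. start DS engine
```

Reviewing restart_plan.log before a live run is a cheap way to confirm the stop/start ordering (engine, agents, metadata server, then back up in reverse).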

07 May 2010

Change DS logs from xmeta to UV

Symptom: 

Errors we used to see because of the performance of DB logging -
1. DS Director becomes slower and slower.
2. When I tried to log in I got the following message:
ACCESS DENIED: Project currently locked. The project is currently locked by the Administrator. Try again later or see your administrator.
3. We would see the Director just freeze up as we tried to open the logs.
4. You could not rerun a job for a very long time after it finished.
5. Job is not in a runnable state. We learned this was because the logs were still being written long after the job completed, so the server thought the job was still active.

Cause: 
DS logs are kept in xmeta (DB2)

Solution: 
Move logs from xmeta to UV

Step:
1. In the DSParams file, set RTLogging=1 and ORLogging=0
2. Restart DataStage
3. Note that all existing logs in DS Director will disappear because the log storage changed (DB2 -> UV)
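
Step 1 can be scripted. A minimal sketch, assuming a POSIX shell with GNU sed; the sample file created below stands in for the real DSParams under your project directory, and you should back the real one up before editing it.

```shell
# Stand-in DSParams showing the "before" state (logging in xmeta, not UV);
# in practice, point the commands below at your project's real DSParams file.
printf 'RTLogging=0\nORLogging=1\n' > DSParams

cp DSParams DSParams.bak                       # always keep a backup
sed -i -e 's/^RTLogging=.*/RTLogging=1/' \
       -e 's/^ORLogging=.*/ORLogging=0/' DSParams

grep -E '^(RTLogging|ORLogging)=' DSParams     # verify the flip before restarting
```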