Monday, 23 January 2017

Repository Data Export/Import using startSQLRepository, startSQLImport and Repository Loader (RL)

Aim: To remove unwanted (versioned) data at BCC.

Steps:
1)Export the data from CATA/CATB into XML files using the startSQLRepository command.
2)Import those XML files into the PUB schema using the startSQLImport command.
3)Import the file assets using RL (explained separately).
4)Run the full deployment using direct SQL deployment (Deployment Server).
5)Trigger the CMS at the store instance.
6)Insert the inventory and price data into the Core schema using SQL scripts.
7)Trigger the Endeca index.
8)Verify the site.

Repositories to be exported and imported (OOTB)

The OOTB repositories used as part of the export and import:

@Store

1) /atg/multisite/SiteRepository  (exported/imported as part of the product catalog export/import)
2) /atg/seo/SEORepository
3) /atg/userprofiling/PersonalizationRepository
4) /atg/commerce/catalog/ProductCatalog
5) /atg/commerce/claimable/ClaimableRepository



@PUB

1) /atg/systemconfiguration/SystemConfigurationRepository
2) /atg/dynamo/security/AdminSqlRepository
3) /atg/portal/framework/PortalRepository
4) /atg/userprofiling/InternalProfileRepository
5) /atg/web/viewmapping/ViewMappingRepository
6) /atg/epub/file/PublishingFileRepository

If we have any custom repositories, we need to export and import those as well.

Pre Export configurations

Before exporting, we have to set the following configurations in the ATG-home/servers/cata_export/localconfig/atg/dynamo/service/jdbc path.

1)DirectJTDataSource.properties
2)FakeXADataSource.properties
3)FakeXADataSource_core.properties
4)IdGenerator.properties
5)JTDataSource.properties
6)SwitchingDataSource.properties


Detailed Explanation
1)DirectJTDataSource.properties
dataSource=/atg/dynamo/service/jdbc/JTDataSource




2)FakeXADataSource.properties

$class=atg.service.jdbc.FakeXADataSource

#driver=oracle.jdbc.xa.client.OracleXADataSource
driver=oracle.jdbc.driver.OracleDriver
URL=jdbc:oracle:thin:@172.21.0.71:1521:DEVOPSDB
user=CATA-schemaUserName
password=CATA-schemaPWD


3)FakeXADataSource_core.properties

$class=atg.service.jdbc.FakeXADataSource
driver=oracle.jdbc.xa.client.OracleXADataSource
URL=jdbc:oracle:thin:@DBIPAddress:1521:ServiceId
user=Core_schemaUserName
password=Core_schemaPWD




4)IdGenerator.properties
dataSource=/atg/dynamo/service/jdbc/SwitchingDataSource

5)JTDataSource.properties
$class=atg.service.jdbc.MonitoredDataSource
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource_core
min=10
max=10
blocking=true
#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=
loggingSQLWarning=false
loggingSQLInfo=false

6)SwitchingDataSource.properties
$class=atg.service.jdbc.MonitoredDataSource

# only use this data source if you do not have an JDBC driver
# which provides true XA data sources
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource
min=10
max=10
blocking=true

#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=

#
# This will log any SQLWarnings that are generated.  By default, we turn
# these off since they tend to be informational, not really warnings.  If
# we want the full traceback for where these messages are generated,
# set loggingWarning to true.
#
loggingSQLWarning=false
#
# The monitored connection by default logs all sql through the log info
# path.
#
loggingSQLInfo=false
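
The directory layout above can be scaffolded from the shell before filling in the files. A minimal sketch, assuming ATG-home is /opt/ATG/ATG10.2/home (adjust to your installation):

ATG_HOME=/opt/ATG/ATG10.2/home
JDBC_DIR=$ATG_HOME/servers/cata_export/localconfig/atg/dynamo/service/jdbc

# Create the cata_export configuration layer used by "startSQLRepository -s cata_export"
mkdir -p "$JDBC_DIR"
cd "$JDBC_DIR"

# Create the six property files listed above; fill them with the
# contents shown in the detailed explanation.
touch DirectJTDataSource.properties FakeXADataSource.properties \
      FakeXADataSource_core.properties IdGenerator.properties \
      JTDataSource.properties SwitchingDataSource.properties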



Pub export configurations

As we have data to export from the PUB repositories as well, we have to set the following configurations in the path
ATG-home/servers/pub_export/localconfig/atg/dynamo/service/jdbc

1)FakeXADataSource.properties
2)FakeXADataSource_core.properties
3)FakeXADataSource_pub.properties
4)IdGenerator.properties
5)IdGenerator_production.properties
6)JTDataSource.properties
7)JTDataSource_production.properties
8)SwitchingDataSource.properties


1)FakeXADataSource.properties

$class=atg.service.jdbc.FakeXADataSource
#driver=oracle.jdbc.xa.client.OracleXADataSource
driver=oracle.jdbc.driver.OracleDriver
URL=jdbc:oracle:thin:@DBHostIpAddrss:1521:SID
user=CATA_schemauserName
password=CATA_schemaPWD

2)FakeXADataSource_core.properties

$class=atg.service.jdbc.FakeXADataSource
driver=oracle.jdbc.xa.client.OracleXADataSource
URL=jdbc:oracle:thin:@DBHostIpAddrss:1521:SID
user=Core_schemauserName
password=Core_schemaPWD

3)FakeXADataSource_pub.properties

$class=atg.service.jdbc.FakeXADataSource

#driver=oracle.jdbc.xa.client.OracleXADataSource
driver=oracle.jdbc.driver.OracleDriver
URL=jdbc:oracle:thin:@172.21.22.14:1521:LPSRMS
user=PUB_schemauserName
password=PUB_schemaPWD


4)IdGenerator.properties
dataSource=/atg/dynamo/service/jdbc/SwitchingDataSource

5)IdGenerator_production.properties
dataSource=/atg/dynamo/service/jdbc/JTDataSource_production

6)JTDataSource.properties
$class=atg.service.jdbc.MonitoredDataSource
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource_pub
min=10
max=10
blocking=true
loggingSQLWarning=false
loggingSQLInfo=false

7)JTDataSource_production.properties

  $class=atg.service.jdbc.MonitoredDataSource
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource_core
min=10
max=10
blocking=true
loggingSQLWarning=false
loggingSQLInfo=false

8)SwitchingDataSource.properties

$class=atg.service.jdbc.MonitoredDataSource
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource
min=10
max=10
blocking=true

#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=
loggingSQLWarning=false
loggingSQLInfo=false



Pre Import configurations


The import configurations have to be done in the following path:

ATG-home/servers/pub_import/localconfig/atg/dynamo/service/jdbc

1)DirectJTDataSource.properties
2)DirectJTDataSource_production.properties
3)FakeXADataSource.properties
4)FakeXADataSource_core.properties
5)FakeXADataSource_pub.properties
6)IdGenerator.properties
7)IdGenerator_production.properties
8)JTDataSource.properties
9)JTDataSource_production.properties
10)SwitchingDataSource.properties


1)DirectJTDataSource.properties
$class=atg.service.jdbc.MonitoredDataSource

dataSource=/atg/dynamo/service/jdbc/FakeXADataSource
min=10
max=10
blocking=true
#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=
loggingSQLWarning=false
loggingSQLInfo=false


2)DirectJTDataSource_production.properties
 
$class=atg.service.jdbc.MonitoredDataSource

dataSource=/atg/dynamo/service/jdbc/FakeXADataSource_core
min=10
max=10
blocking=true
#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=
loggingSQLWarning=false
loggingSQLInfo=false


3)FakeXADataSource.properties

  $class=atg.service.jdbc.FakeXADataSource

#driver=oracle.jdbc.xa.client.OracleXADataSource
driver=oracle.jdbc.driver.OracleDriver
URL=jdbc:oracle:thin:@DBHostIPAddress:1521:SID
user=CATA_SchemauserName
password=CATA_SchemauserPWD


4)FakeXADataSource_core.properties
$class=atg.service.jdbc.FakeXADataSource

driver=oracle.jdbc.xa.client.OracleXADataSource
URL=jdbc:oracle:thin:@DBHostIPAddress:1521:SID
user=Core_SchemauserName
password=Core_SchemauserPWD

5)FakeXADataSource_pub.properties
$class=atg.service.jdbc.FakeXADataSource

#driver=oracle.jdbc.xa.client.OracleXADataSource
driver=oracle.jdbc.driver.OracleDriver
URL=jdbc:oracle:thin:@DBHostIPAddress:1521:SID
user=PUB_SchemauserName
password=PUB_SchemauserPWD

6)IdGenerator.properties
dataSource=/atg/dynamo/service/jdbc/JTDataSource
7)IdGenerator_production.properties
dataSource=/atg/dynamo/service/jdbc/JTDataSource_production

8)JTDataSource.properties
$class=atg.service.jdbc.MonitoredDataSource
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource_pub

min=10
max=500
blocking=true
#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=
loggingSQLWarning=false
loggingSQLInfo=false


9)JTDataSource_production.properties
  $class=atg.service.jdbc.MonitoredDataSource
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource_core
min=10
max=10
blocking=true

#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=
loggingSQLWarning=false
loggingSQLInfo=false

10)SwitchingDataSource.properties
 $class=atg.service.jdbc.MonitoredDataSource
dataSource=/atg/dynamo/service/jdbc/FakeXADataSource
min=10
max=10
blocking=true
#maxBlockTime=
#maxCreateTime=
#maxCreateAttempts=
loggingSQLWarning=false
loggingSQLInfo=true

Export


We can export the data using the following command on the Linux box.

nohup startSQLRepository  -m modules.Commerce -s cata_export -encoding ISO-8859-1 -repository /atg/commerce/catalog/ProductCatalog -noTransaction -export all exported_productCatalog.xml > exported_productCatalog_log.txt &


Explanation:
Run the export on the Linux box from the {ATG_HOME}/bin folder.
The modules should be present under the /opt/ATG/ATG10.2 folder and should contain the Commerce folder.
The content will be exported to the file exported_productCatalog.xml.
The log will be written to the file exported_productCatalog_log.txt.


We can export all the versioned and non-versioned repositories. The ProductCatalog export usually takes time; the duration depends on the number of assets, the Linux box configuration (such as JVM RAM) and the DB server configuration (such as Oracle server RAM).

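The same command pattern works for the other repositories listed earlier. For example, a sketch (assuming the same cata_export server layer and the modules.Commerce module) of exporting the claimable repository, which is imported later in this post:

nohup startSQLRepository -m modules.Commerce -s cata_export -encoding ISO-8859-1 -repository /atg/commerce/claimable/ClaimableRepository -noTransaction -export all exported_claimable_dump.xml > exported_claimable_log.txt &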

Import (startSQLImport )

We can import the data file (.xml file) in two ways.

1)Import the content without check-in and execute check-in scripts afterwards (huge data).
2)Import the content and check in as part of the command (small data).
 

1)Import the content without check-in and execute check-in scripts afterwards (huge data)

This is used to import the product catalog data from the XML file into the PUB schema; the command used is as follows.

nohup  startSQLImport -m modules.CA -s  pub_import  -repository  /atg/commerce/catalog/ProductCatalog -file  /opt/ATG/ATG10.2/home/bin/CAT-A/exported_productCatalog.xml  -workspace ProductCatalogImport -nocheckin   -batchSize 5000 > imported_productCatalog_log.txt &

In the above command -nocheckin is given, so the workspace won't be checked in automatically; we need to run DB scripts to check in the workspace (ProductCatalogImport). The -batchSize option also helps speed up the import.
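Since the command runs in the background via nohup, a quick way to keep an eye on it (a sketch; the log file name matches the redirect used above):

# Follow the import log written by the nohup redirect
tail -f imported_productCatalog_log.txt

# Confirm the importer process is still running
ps -ef | grep startSQLImport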

2)Import the content and check in as part of the command (small data)

nohup startSQLImport -m modules.CA -s  pub_import  -repository    /atg/commerce/claimable/ClaimableRepository -file  /opt/ATG/ATG10.2/home/bin/exportFilesLocation/exported_claimable_dump.xml  -workspace Claimable_import > /opt/ATG/ATG10.2/home/bin/Logs/imported_Claimable_logo.txt &

In the above command, -nocheckin and -batchSize are not specified because the file contains a small amount of data compared with the product catalog. So the workspace is checked in as part of the import; there is no need to run any scripts to check in the workspace.


1)Should be executed at the {ATG_HOME}/bin location.
2)exported_productCatalog.xml is the file exported with the startSQLRepository command, as described in the Export section above.
3)We executed in nocheckin mode.
4)The generated log file name is imported_productCatalog_log.txt.

While importing the ProductCatalog, the following error may occur.


Error: While importing the catalog using the startSQLImport command, we got the following error.

CONTAINER:atg.adapter.gsa.sqlimport.ImportException; SOURCE:com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 26))
at [row,col {unknown-source}]: [28036853,45]
at atg.adapter.gsa.sqlimport.ImportParserImpl.parseFirsPass(ImportParserImpl.java:130)
at atg.adapter.gsa.sqlimport.SQLImporter.doImport(SQLImporter.java:1992)
at atg.adapter.gsa.sqlimport.SQLImporter.execute(SQLImporter.java:1353)
at atg.adapter.gsa.sqlimport.SQLImporter.main(SQLImporter.java:1306)
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 26))
at [row,col {unknown-source}]: [28036853,45]
at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:666)
at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:651)
at com.ctc.wstx.sr.BasicStreamReader.readCDataPrimary(BasicStreamReader.java:4226)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTreeCommentOrCData(BasicStreamReader.java:3285)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2801)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1065)
at org.codehaus.stax2.ri.Stax2EventReaderImpl.nextEvent(Stax2EventReaderImpl.java:255)
at atg.adapter.gsa.sqlimport.ImportParserImpl.parseFirsPass(ImportParserImpl.java:112)
... 3 more

Solution: Create a small .sh file and execute it to strip the control characters, for example:

echo 'starting'
# Replace every control character (including line breaks) with a space;
# spaces are still valid XML whitespace, so the file remains parseable.
tr '[:cntrl:]' ' ' < productCatalogExportData.xml > productCatalogExportData_withRemovedChars.xml
echo 'ending'
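
A quick check (a sketch, assuming bash and GNU grep) that the offending character (CTRL-CHAR code 26, the SUB character) is really gone before re-running the import:

# Should print 0 when no SUB (0x1A) characters remain
LC_ALL=C grep -c $'\x1a' productCatalogExportData_withRemovedChars.xml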
-----

Logs for startSQLImport when the workspace is not checked in as part of the startSQLImport command execution

Example logs while importing the product catalog. 

The following logs show an import that does not check in the project as part of the import.
As the data is large, we opted for the import process without check-in.
[info@2017-01-25 10:24:55.749] Phase 1 of 2: 100%
[info@2017-01-25 10:36:50.757] Phase 2 of 3: 10%

[info@2017-01-25 10:45:38.261] Phase 2 of 3: 20%

[info@2017-01-25 10:54:16.794] Phase 2 of 3: 30%

[info@2017-01-25 11:02:09.893] Phase 2 of 3: 40%

[info@2017-01-25 11:11:05.023] Phase 2 of 3: 50%

[info@2017-01-25 11:19:55.685] Phase 2 of 3: 60%

[info@2017-01-25 11:28:59.399] Phase 2 of 3: 70%

[info@2017-01-25 11:48:35.439] Phase 2 of 3: 80%

[info@2017-01-25 12:11:44.144] Phase 2 of 3: 90%

[info@2017-01-25 12:30:29.815] Phase 2 of 3: 100%

[info@2017-01-25 12:30:29.904] performing add/update/removes phase 3

[info@2017-01-25 12:46:43.728] Phase 3 of 3: 10%

[info@2017-01-25 13:00:20.402] Phase 3 of 3: 20%

[info@2017-01-25 13:14:11.638] imported 2,931,754 items in 18:35:43 hh:mm:ss (excluding checkin)

[info@2017-01-25 13:14:11.656] imported 2,931,754 items at 43.79 items/second

[info@2017-01-25 13:14:11.656] shutting down

.....
.......

Logs for startSQLImport when the workspace is checked in as part of the startSQLImport command execution

Other repositories, like SEO, are checked in as part of the import itself.

[info@2020-12-05 04:46:48.026] created new workspace: workspace:76101
[info@2020-12-05 04:46:48.034] parsing input data

[info@2020-12-05 04:46:48.323] importing 400 items.

[info@2020-12-05 04:46:48.323] checking out versioned items in batch

[info@2020-12-05 04:46:49.338] performing add/update/removes

[info@2020-12-05 04:46:50.472] Phase 2 of 2: 100%

[info@2020-12-05 04:46:50.472] checkIn: workspace:76101


Manual workspace check-in (product catalog)


1)After this, we have to import the new file using the nohup command; it will be imported into the product catalog.
2)But the workspace is not checked in.
3)We have to manually check in the product catalog assets. The following scripts are used to check in the workspace; this works for the initial import. If we have custom tables, we need to run similar updates for those also. The scripts below cover the purely OOTB product catalog tables.

update  DBC_MANUFACTURER  set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp  ;
update   DCS_CATALOG   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_CATEGORY   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_CLOSE_QUALIF   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_FOLDER   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_GEN_FOL_CAT   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_MEDIA   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_PRICE_LIST   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_PRM_FOLDER   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   DCS_PROMOTION   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   SITE_CONFIGURATION   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
update   SITE_GROUP   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
commit;

update   DCS_PRODUCT    set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
commit;
update   DCS_SKU   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
commit;
update   DCS_SKU_LINK   set IS_HEAD =1 , VERSION_EDITABLE = 0 , CHECKIN_DATE = current_timestamp ;
commit;

update avm_workspace set checked_in = 1, ci_time = current_timestamp,userid ='admin',locked =0, editable =0, change_was ='SQL import' where ws_id in (select  workspace_id from dcs_catalog where rownum =1 );
commit;

Note: Here we can give current_timestamp directly as the value; it will work.
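
A hedged verification sketch (the sqlplus connection string is a placeholder; it assumes a TNS alias pointing at the PUB schema) to confirm that nothing was left un-checked-in right after running the scripts above:

sqlplus PUB_schemaUserName/PUB_schemaPWD@PUB_TNS_ALIAS <<'EOF'
-- Immediately after the initial import and check-in scripts, both counts should be 0.
select count(*) from DCS_PRODUCT where IS_HEAD = 0 or VERSION_EDITABLE = 1;
select count(*) from DCS_SKU     where IS_HEAD = 0 or VERSION_EDITABLE = 1;
-- The imported workspace should show checked_in = 1.
select checked_in, locked, editable from avm_workspace
 where ws_id in (select workspace_id from dcs_catalog where rownum = 1);
EOF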

Once all the versioned data (all repositories) is imported using startSQLImport, the data is available in the PUB schema. After this we have to do a full deployment (useDirectSQL=true at the Deployment Manager). Only then will the data be available at CATA/CATB (if CATB is configured). So startSQLImport is used for the initial import only.

Repository loader

Pre-setup:
A copy of the production deployment config file folder is required to load the content config file data, which holds all the file assets; this config folder should be accessible to the BCC instance.

1)This explains the initial import of file assets, e.g. from production.
2)Used to import file assets into BCC when the PUB schema is empty.

The following components have to be changed.


1)/atg/epub/file/typemappers/PublishingTypeMapper
2)/atg/dynamo/service/loader/FileSystemMonitorService
3)/atg/epub/file/VersionedLoaderEventListener
Configuration at the components

1)/atg/epub/file/typemappers/PublishingTypeMapper
Implement the custom class and configure it here.
2)/atg/dynamo/service/loader/FileSystemMonitorService

rootPath=
  
3)/atg/epub/file/VersionedLoaderEventListener

createProjects=true ["true" only when the targets already exist; "false" when loading using a workspace]

The URL is
http://hostName:BCCportNumber/rl
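
A minimal sketch (the server name bcc_server and the copied production config folder path are assumptions) of creating the Repository Loader component overrides from the shell, based on the components listed above:

RL_CONF=/opt/ATG/ATG10.2/home/servers/bcc_server/localconfig
mkdir -p "$RL_CONF/atg/dynamo/service/loader" "$RL_CONF/atg/epub/file"

# Point the file system monitor at the copied production deployment config folder.
cat > "$RL_CONF/atg/dynamo/service/loader/FileSystemMonitorService.properties" <<'EOF'
rootPath=/opt/ATG/ATG10.2/home/prod_config_copy
EOF

# createProjects=true only when the deployment targets already exist;
# set it to false to load through a workspace instead.
cat > "$RL_CONF/atg/epub/file/VersionedLoaderEventListener.properties" <<'EOF'
createProjects=true
EOF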











