Class DataIO

java.lang.Object
  |
  +--DataIO

public class DataIO
extends java.lang.Object

This class contains methods to read and write array data, GIPO, .quant, and MAE files.

This work was produced by Peter Lemkin of the National Cancer Institute, an agency of the United States Government. As a work of the United States Government there is no associated copyright. It is offered as open source software under the Mozilla Public License (version 1.1) subject to the limitations noted in the accompanying LEGAL file.

Version:
$Date: 2002/10/14 19:22:29 $ $Revision: 1.17 $
Author:
B. Stephens (SAIC), P. Lemkin (NCI), G. Thornwall (SAIC), NCI-Frederick, Frederick, MD
See Also:
MAExplorer Home

Field Summary
private static int affyFieldDescriptionCol
          Special hack for Affy files - see chkAndEditFieldNames().
 Cvt2Mae cvt
          Global links
 Element[] datum
          data elements from the input table
 int expectedNbrTokens
          expected # of tokens for this layout
 CvtGUI gui
          Global GUI popup frame
 int idxClone_ID
          GIPO index for clone ID
 int idxDbEst3
          GIPO index 'dbEST' identifier data.
 int idxDbEst5
          GIPO index 'dbEST' identifier data.
 int idxDetValue
          GIPO index for Spot Detection value
 int idxDiffCall
          GIPO index for Affy Diff Call
 int idxField
          GIPO index
 int idxFoldChange
          GIPO index for Fold Change
 int idxGenBankAcc
          GIPO index 'GenBank' identifier data.
 int idxGenBankAcc3
          GIPO index 'GenBank' identifier data.
 int idxGenBankAcc5
          GIPO index 'GenBank' identifier data.
 int idxGeneName
          GIPO index for gene name
 int idxGrid
          GIPO index for Grid
 int idxGridCol
          GIPO index for grid column
 int idxGridRow
          GIPO index for grid row
 int idxIdentifier
          generic Genomic ID - used when no other ids
 int idxLocation
          GIPO index fields for fieldNames[]
 int idxLocusLinkID
          GIPO index
 int idxNAME_GRC
          GIPO index for NAME_GRC (Molecular Dynamics "NAME_GRC" spec)
 int idxPlate
          GIPO index for plate
 int idxPlateCol
          GIPO index for plate column
 int idxPlateRow
          GIPO index for plate row
 int idxQualCheck
          GIPO index
 int idxQualCheckGIPO
          GIPO index for 'GIPO' QualCheck data
 int idxRawBackground
          GIPO index
 int idxRawBackground1
          GIPO index
 int idxRawBackground2
          GIPO index
 int idxRawIntensity
          GIPO index
 int idxRawIntensity1
          GIPO index
 int idxRawIntensity2
          GIPO index
 int idxSwissProtID
          GIPO index for SwissProtID
 int idxUnigene_cluster_ID
          GIPO index for Unigene cluster ID
 int idxUnigene_cluster_name
          GIPO index for Unigene cluster name
 int idxX
          Quant index fields are for fieldNames[]
 int idxY
          GIPO index
 int infillCnt
          # of infill spots generated
 MaeConfigData mcd
          Mae Config Data
 SetupLayouts sul
          Global layouts
 java.lang.String title
          GUI title
 java.lang.String[] tokArray
          [MAX_IN_TOKENS_PER_ROW] holds input tokens
 boolean[] useTokFlagGQ
          [MAX_IN_TOKENS_PER_ROW] tokens to get for FieldGQ data
 UtilCM util
          Global utilities
 
Constructor Summary
DataIO(Cvt2Mae cvt)
          DataIO() - Constructor
 
Method Summary
 java.lang.String addGenomicDataFromDescr(java.lang.String sDescr)
          addGenomicDataFromDescr() - set up Affy genomic ID data from a Description such as "Cluster Incl AW208667:uo62e02.x1 Mus musculus cDNA, 3 end /clone=IMAGE-2647130 /clone_end=3 /gb=AW208667 /gi=6514607 /ug=Mm.4609 /len=474". Add field "Clone_ID" from /clone=IMAGE-2647130; add field "GenBankAcc" from /gb=AW208667; add field "UniGeneID" from /ug=Mm.4609; add field "LocusID" from /gb=AW208667.
 java.lang.String addGenomicFieldsNamesFromDescr()
          addGenomicFieldsNamesFromDescr() - set up genomic ID field names from a Description such as "Cluster Incl AW208667:uo62e02.x1 Mus musculus cDNA, 3 end /clone=IMAGE-2647130 /clone_end=3 /gb=AW208667 /gi=6514607 /ug=Mm.4609 /len=474". Add field "Clone_ID" from /clone=IMAGE-2647130; add field "GenBankAcc" from /gb=AW208667; add field "UniGeneID" from /ug=Mm.4609; add field "LocusID" from /gb=AW208667.
private  boolean checkIfRequiredFieldsAreNull(boolean[] useTokFlagGQ, java.lang.String[] tokArray)
          checkIfRequiredFieldsAreNull() - test whether the required fields are "". Returns true if the required fields are "".
 boolean chkAndEditFieldNames(java.lang.String[] tokArray, java.lang.String lookForOpr)
          chkAndEditFieldNames() - check and edit Sample or Field names.
 java.lang.String cvtGipoDatumToTabDelimStr(Element d)
          cvtGipoDatumToTabDelimStr() - convert Element to tab-delimited GIPO string
 java.lang.String cvtQuantDatumToTabDelimStr(Element d)
          cvtQuantDatumToTabDelimStr() - convert Element to tab-delimited Quant string
private  java.lang.String cvtUserDataToQualCheck(java.lang.String qcData)
          cvtUserDataToQualCheck() - map user data range to valid QualCheck range
private  int extractDatumFromDataRow(java.lang.String[] tokArray, int rowNbr, int spotNbr, java.lang.String line, int grid, int gRow, int gCol)
          extractDatumFromDataRow() - extract datum[spotID] from a row of token data. Returns the spotID # (Location #) if OK, else -1.
 boolean genMultIdxMapOfFieldNameData(java.lang.String[] fieldNamesGQ, int nDataFields, int n)
          genMultIdxMapOfFieldNameData() - generate quant idx map of data from the field map and names of input file fields.
[1] Generate a map of fields where Samples starts to the indices for each .quant sample to be generated.
[2] Generate useTokFlag[] from fieldNames[] and FieldMap data.
 java.lang.String makeGipoFieldsTabDelimStr()
          makeGipoFieldsTabDelimStr() - make tab-delimited GIPO Fields string
 java.lang.String makeQuantFieldsTabDelimStr()
          makeQuantFieldsTabDelimStr() - make tab-delimited Quant Fields string
(package private)  void quickSortByIntLoc(Element[] d, int lo0, int hi0)
          quickSortByIntLoc() - recursively sort the Element list by datum[].iLocation.
(package private)  void quickSortByLocStr(Element[] d, int lo0, int hi0)
          quickSortByLocStr() - sort the Element list datum[].location.
 boolean readData(int n)
          readData() - read composite vendor or user data from file.
 void setupFieldNameIndices()
           
 void setupFilenames()
          setupFilenames() - set up proper file names based on path analysis. These are kept in the mcd.XXXX state.
private  boolean updateInfillLocationIDs()
          updateInfillLocationIDs() - if using Location IDs, sort datum[1:mcd.maxRowsExpected] by Location, add infill spots, and renumber grid coordinates.
private  boolean writeConfigFile(MaeConfigData mcd)
          writeConfigFile() - create Config/MaExplorerConfig-fn.txt file
 boolean writeGipoData()
          writeGipoData() - write out the MAExplorer GIPO data file xxxx.gipo
private  boolean writeMAEstartupFile()
          writeMAEstartupFile() - create MAE/Start.mae file
 boolean writeQuantData(int n)
          writeQuantData() - write out the MAExplorer Quant data files. These include the set of xxxxnnn.quant files.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

cvt

public Cvt2Mae cvt
Global links

mcd

public MaeConfigData mcd
Mae Config Data

util

public UtilCM util
Global utilities

sul

public SetupLayouts sul
Global layouts

gui

public CvtGUI gui
Global GUI popup frame

title

public java.lang.String title
GUI title

tokArray

public java.lang.String[] tokArray
[MAX_IN_TOKENS_PER_ROW] holds input tokens

useTokFlagGQ

public boolean[] useTokFlagGQ
[MAX_IN_TOKENS_PER_ROW] tokens to get for FieldGQ data

datum

public Element[] datum
data elements from the input table

infillCnt

public int infillCnt
# of infill spots generated

expectedNbrTokens

public int expectedNbrTokens
expected # of tokens for this layout

idxLocation

public int idxLocation
GIPO index fields for fieldNames[]

idxField

public int idxField
GIPO index

idxGrid

public int idxGrid
GIPO index for Grid

idxGridRow

public int idxGridRow
GIPO index for grid row

idxGridCol

public int idxGridCol
GIPO index for grid column

idxNAME_GRC

public int idxNAME_GRC
GIPO index for NAME_GRC (Molecular Dynamics "NAME_GRC" spec)

idxPlate

public int idxPlate
GIPO index for plate

idxPlateRow

public int idxPlateRow
GIPO index for plate row

idxPlateCol

public int idxPlateCol
GIPO index for plate column

idxQualCheckGIPO

public int idxQualCheckGIPO
GIPO index for 'GIPO' QualCheck data

idxIdentifier

public int idxIdentifier
generic Genomic ID - used when no other ids

idxClone_ID

public int idxClone_ID
GIPO index for clone ID

idxGeneName

public int idxGeneName
GIPO index for gene name

idxUnigene_cluster_ID

public int idxUnigene_cluster_ID
GIPO index for Unigene cluster ID

idxUnigene_cluster_name

public int idxUnigene_cluster_name
GIPO index for Unigene cluster name

idxGenBankAcc

public int idxGenBankAcc
GIPO index 'GenBank' identifier data. See http://ncbi.nlm.nih.gov/

idxGenBankAcc3

public int idxGenBankAcc3
GIPO index 'GenBank' identifier data. See http://ncbi.nlm.nih.gov/

idxGenBankAcc5

public int idxGenBankAcc5
GIPO index 'GenBank' identifier data. See http://ncbi.nlm.nih.gov/

idxDbEst3

public int idxDbEst3
GIPO index 'dbEST' identifier data.

idxDbEst5

public int idxDbEst5
GIPO index 'dbEST' identifier data.

idxSwissProtID

public int idxSwissProtID
GIPO index for SwissProtID

idxLocusLinkID

public int idxLocusLinkID
GIPO index

idxX

public int idxX
Quant index fields are for fieldNames[]

idxY

public int idxY
GIPO index

idxRawIntensity

public int idxRawIntensity
GIPO index

idxRawIntensity1

public int idxRawIntensity1
GIPO index

idxRawIntensity2

public int idxRawIntensity2
GIPO index

idxRawBackground

public int idxRawBackground
GIPO index

idxRawBackground1

public int idxRawBackground1
GIPO index

idxRawBackground2

public int idxRawBackground2
GIPO index

idxQualCheck

public int idxQualCheck
GIPO index

idxDetValue

public int idxDetValue
GIPO index for Spot Detection value

idxDiffCall

public int idxDiffCall
GIPO index for Affy Diff Call

idxFoldChange

public int idxFoldChange
GIPO index for Fold Change

affyFieldDescriptionCol

private static int affyFieldDescriptionCol
Special hack for Affy files - see chkAndEditFieldNames(). This will contain the column for the Field "Description" if needed, else -1.
Constructor Detail

DataIO

public DataIO(Cvt2Mae cvt)
DataIO() - Constructor
Parameters:
cvt - Cvt2Mae instance
Method Detail

extractDatumFromDataRow

private int extractDatumFromDataRow(java.lang.String[] tokArray,
                                    int rowNbr,
                                    int spotNbr,
                                    java.lang.String line,
                                    int grid,
                                    int gRow,
                                    int gCol)
extractDatumFromDataRow() - extract datum[spotID] from a row of token data. Returns the spotID # (Location #) if OK, else -1.
Parameters:
tokArray - tokens in row
rowNbr - row number in file
spotNbr - sequential spot number
line - raw line
grid - computed Grid
gRow - computed Row
gCol - computed Column
Returns:
SpotID number location if ok, else -1
See Also:
Element

setupFilenames

public void setupFilenames()
setupFilenames() - set up proper file names based on path analysis. These are kept in the mcd.XXXX state.

setupFieldNameIndices

public void setupFieldNameIndices()

readData

public boolean readData(int n)
readData() - read composite vendor or user data from file. This assumes that the current state is set up in the MCD config data structure.
Parameters:
n - is the nth user data file to read
Returns:
true if successful
See Also:
Element, FieldMap.genUseTokFlags(java.lang.String[]), FileTable, FileTable.readTableFieldsFromFile(java.lang.String, int), ParseTable, ParseTable.getAllDelimTokens(java.lang.String, java.lang.String[], boolean), TextFrame.appendLog(java.lang.String), UtilCM.logMsg(java.lang.String, java.awt.Color), checkIfRequiredFieldsAreNull(boolean[], java.lang.String[]), chkAndEditFieldNames(java.lang.String[], java.lang.String), extractDatumFromDataRow(java.lang.String[], int, int, java.lang.String, int, int, int), genMultIdxMapOfFieldNameData(java.lang.String[], int, int), updateInfillLocationIDs()

checkIfRequiredFieldsAreNull

private boolean checkIfRequiredFieldsAreNull(boolean[] useTokFlagGQ,
                                             java.lang.String[] tokArray)
checkIfRequiredFieldsAreNull() - test whether the required fields are "". Returns true if the required fields are "".
Parameters:
useTokFlagGQ - boolean array
tokArray - String array
Returns:
true if required fields are null, false if required fields are not null
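
The check described above can be illustrated with a minimal sketch. The class and method names below are hypothetical, and the sketch assumes that "required" means "flagged in useTokFlagGQ" and that a single empty or null required token is enough to return true; the actual private method may apply a different rule.

```java
// Hypothetical sketch, not the DataIO implementation.
class RequiredFieldsSketch {
    /**
     * Returns true if any token flagged as required is null or "".
     * Assumption: one empty required token is enough to report true.
     */
    static boolean anyRequiredFieldEmpty(boolean[] useTokFlag, String[] tokArray) {
        // Only compare positions present in both arrays.
        int n = Math.min(useTokFlag.length, tokArray.length);
        for (int i = 0; i < n; i++) {
            if (useTokFlag[i] && (tokArray[i] == null || tokArray[i].isEmpty())) {
                return true;
            }
        }
        return false;
    }
}
```

Tokens that are not flagged are ignored, so an empty optional column does not cause a row to be reported as incomplete.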

genMultIdxMapOfFieldNameData

public boolean genMultIdxMapOfFieldNameData(java.lang.String[] fieldNamesGQ,
                                            int nDataFields,
                                            int n)
genMultIdxMapOfFieldNameData() - generate quant idx map of data from the field map and names of input file fields.
[1] Generate a map of fields where Samples starts to the indices for each .quant sample to be generated.
[2] Generate useTokFlag[] from fieldNames[] and FieldMap data. This analyzes the field map and extracts only the data required.
[3] Set up lookup FieldName indices for mapping data. Note: for multiple samples per file this will remap things like rawIntensity by flagging other fields.
Returns true on success.
Parameters:
fieldNamesGQ - array of field names
nDataFields - max number of data fields
n - nth data file from 0
Returns:
true if successful
See Also:
FieldMap.genUseTokFlags(java.lang.String[]), FieldMap.lookupUserIndex(java.lang.String, java.lang.String), UtilCM.logMsg(java.lang.String, java.awt.Color), setupFieldNameIndices()
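
Steps [2] and [3] can be pictured with a small, self-contained sketch. It is not the FieldMap or DataIO code: the class name, both method names, and the use of a Set of wanted field names are assumptions made for illustration, as is the convention that a missing field maps to index -1.

```java
import java.util.Set;

// Hypothetical sketch of flagging wanted columns and looking up field indices.
class TokFlagSketch {
    /** Marks each input column whose header name is in the wanted set. */
    static boolean[] flagWantedColumns(String[] fieldNames, Set<String> wanted) {
        boolean[] use = new boolean[fieldNames.length];
        for (int i = 0; i < fieldNames.length; i++) {
            use[i] = fieldNames[i] != null && wanted.contains(fieldNames[i]);
        }
        return use;
    }

    /** Returns the column index of the named field, or -1 if absent (assumed convention). */
    static int indexOfField(String[] fieldNames, String name) {
        for (int i = 0; i < fieldNames.length; i++) {
            if (name.equals(fieldNames[i])) {
                return i;
            }
        }
        return -1;
    }
}
```

With headers {"Location", "Comment", "RawIntensity"} and a wanted set of "Location" and "RawIntensity", the flags mark columns 0 and 2, and the index lookups play the role of fields such as idxLocation and idxRawIntensity.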

chkAndEditFieldNames

public boolean chkAndEditFieldNames(java.lang.String[] tokArray,
                                    java.lang.String lookForOpr)
chkAndEditFieldNames() - check and edit Sample or Field names. This does special handling required for various types of data. Change the names in the tokArray[]. Return true if made a change.
Parameters:
tokArray - array of tokens to search through
lookForOpr - look for this
Returns:
true if a change was made to the field

updateInfillLocationIDs

private boolean updateInfillLocationIDs()
updateInfillLocationIDs() - if using Location IDs, sort datum[1:mcd.maxRowsExpected] by Location, add infill spots, and renumber grid coordinates. This may grow the array to datum[1:mcd.maxRowsComputed], where mcd.maxRowsComputed = mcd.highestID, or spotNbr if there is no "Location" field.
Returns:
true if updated
See Also:
Element, PseudoArray, UtilCM.logMsg(java.lang.String, java.awt.Color), UtilCM.logMsg2(java.lang.String, java.awt.Color), UtilCM.logMsg3(java.lang.String, java.awt.Color)
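
As a rough illustration of the infill idea, the sketch below finds which Location IDs between 1 and the highest ID seen are missing from the input, which is where infill spots would be needed. The class and method names are hypothetical, and the sketch works on plain int IDs rather than Element objects; it does not renumber grid coordinates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: locate the gaps that infill spots would have to cover.
class InfillSketch {
    /** Returns the location IDs in 1..highestID that are absent from the input. */
    static List<Integer> missingLocations(int[] locations) {
        int[] sorted = locations.clone();
        Arrays.sort(sorted);                      // sort by Location first
        List<Integer> missing = new ArrayList<>();
        int expected = 1;                         // IDs are assumed to start at 1
        for (int loc : sorted) {
            while (expected < loc) {
                missing.add(expected++);          // gap: an infill spot is needed here
            }
            expected = Math.max(expected, loc + 1); // also skips duplicate IDs
        }
        return missing;
    }
}
```

The number of entries returned corresponds to what infillCnt would count under these assumptions.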

quickSortByIntLoc

void quickSortByIntLoc(Element[] d,
                       int lo0,
                       int hi0)
quickSortByIntLoc() - recursively sort the Element list by datum[].iLocation. Based on the QuickSort method by James Gosling from Sun's SortDemo applet.
Parameters:
d - array of Elements to sort
lo0 - low index
hi0 - high index
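
For readers unfamiliar with this style of sort, here is a minimal sketch of an in-place recursive quicksort over an inclusive index range, keyed on an integer location. It follows the general shape of the classic SortDemo quicksort but is not copied from DataIO; the Spot class stands in for Element and is purely hypothetical.

```java
// Hypothetical sketch of a recursive quicksort keyed on an int location.
class LocSortSketch {
    /** Stand-in for Element; only the integer location is modelled. */
    static class Spot {
        final int iLocation;
        Spot(int iLocation) { this.iLocation = iLocation; }
    }

    /** Sorts d[lo0..hi0] (both inclusive) in ascending iLocation order. */
    static void quickSortByIntLoc(Spot[] d, int lo0, int hi0) {
        if (hi0 <= lo0) {
            return;                               // zero or one element: already sorted
        }
        int lo = lo0;
        int hi = hi0;
        int mid = d[(lo0 + hi0) >>> 1].iLocation; // pivot value from the middle element
        while (lo <= hi) {
            while (lo < hi0 && d[lo].iLocation < mid) lo++;
            while (hi > lo0 && d[hi].iLocation > mid) hi--;
            if (lo <= hi) {                       // swap the out-of-place pair
                Spot t = d[lo];
                d[lo] = d[hi];
                d[hi] = t;
                lo++;
                hi--;
            }
        }
        if (lo0 < hi) quickSortByIntLoc(d, lo0, hi); // sort the left partition
        if (lo < hi0) quickSortByIntLoc(d, lo, hi0); // sort the right partition
    }
}
```

A call such as quickSortByIntLoc(spots, 0, spots.length - 1) sorts the whole array; passing a narrower range sorts only that slice, which matches the lo0/hi0 parameters documented above.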

quickSortByLocStr

void quickSortByLocStr(Element[] d,
                       int lo0,
                       int hi0)
quickSortByLocStr() - sort the Element list by datum[].location. Note: d.location is a String and may not be a numeric string. Based on the QuickSort method by James Gosling from Sun's SortDemo applet.
Parameters:
d - array of Elements to sort
lo0 - low index
hi0 - high index

addGenomicFieldsNamesFromDescr

public java.lang.String addGenomicFieldsNamesFromDescr()
addGenomicFieldsNamesFromDescr() - set up genomic ID field names from a Description such as
"Cluster Incl AW208667:uo62e02.x1 Mus musculus cDNA, 3 end /clone=IMAGE-2647130 /clone_end=3 /gb=AW208667 /gi=6514607 /ug=Mm.4609 /len=474"
Add field "Clone_ID" from /clone=IMAGE-2647130
Add field "GenBankAcc" from /gb=AW208667
Add field "UniGeneID" from /ug=Mm.4609
Add field "LocusID" from /gb=AW208667
Returns:
field names based on Description

addGenomicDataFromDescr

public java.lang.String addGenomicDataFromDescr(java.lang.String sDescr)
addGenomicDataFromDescr() - set up Affy genomic ID data from a Description such as
"Cluster Incl AW208667:uo62e02.x1 Mus musculus cDNA, 3 end /clone=IMAGE-2647130 /clone_end=3 /gb=AW208667 /gi=6514607 /ug=Mm.4609 /len=474"
Add field "Clone_ID" from /clone=IMAGE-2647130
Add field "GenBankAcc" from /gb=AW208667
Add field "UniGeneID" from /ug=Mm.4609
Add field "LocusID" from /gb=AW208667
Parameters:
sDescr - Description
Returns:
field names based on Description
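
The Description string quoted above carries its identifiers as /key=value tags. The sketch below shows one way such tags could be pulled out with a regular expression and joined into a tab-delimited string. The class, the method names, and the choice of output order (Clone_ID, GenBankAcc, UniGeneID) are assumptions for illustration, not the DataIO implementation; the LocusID field is left out because the text above does not say how it differs from GenBankAcc.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extract /key=value tags from an Affy Description.
class AffyDescrSketch {
    // "/key=value", where the value runs up to the next whitespace.
    private static final Pattern TAG = Pattern.compile("/(\\w+)=(\\S+)");

    /** Returns the tags in order of appearance, e.g. clone -> IMAGE-2647130. */
    static Map<String, String> parseTags(String sDescr) {
        Map<String, String> tags = new LinkedHashMap<>();
        Matcher m = TAG.matcher(sDescr);
        while (m.find()) {
            tags.put(m.group(1), m.group(2));
        }
        return tags;
    }

    /** Joins Clone_ID, GenBankAcc, and UniGeneID with tabs; a missing tag becomes "". */
    static String toTabDelim(String sDescr) {
        Map<String, String> t = parseTags(sDescr);
        return String.join("\t",
                t.getOrDefault("clone", ""),
                t.getOrDefault("gb", ""),
                t.getOrDefault("ug", ""));
    }
}
```

Applied to the example Description, parseTags yields clone, clone_end, gb, gi, ug, and len entries, and toTabDelim returns the three selected values separated by tabs.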

writeGipoData

public boolean writeGipoData()
writeGipoData() - write out the MAExplorer GIPO data file xxxx.gipo
Returns:
true if successful
See Also:
CvtGUI.getChipsetStr(), CvtGUI.getProjectStr(), Element, TextFrame.appendLog(java.lang.String)

writeQuantData

public boolean writeQuantData(int n)
writeQuantData() - write out the MAExplorer Quant data files. These include the set of xxxxnnn.quant files.
Parameters:
n - is the nth quant file to write
Returns:
true if successful
See Also:
CvtGUI.getChipsetStr(), CvtGUI.getProjectStr(), TextFrame.appendLog(java.lang.String), UtilCM.logMsg(java.lang.String, java.awt.Color), UtilCM.logMsg2(java.lang.String, java.awt.Color), UtilCM.logMsg3(java.lang.String, java.awt.Color), cvtQuantDatumToTabDelimStr(Element), makeQuantFieldsTabDelimStr()

makeGipoFieldsTabDelimStr

public java.lang.String makeGipoFieldsTabDelimStr()
makeGipoFieldsTabDelimStr() - make tab-delimited GIPO Fields string
Returns:
tab-delimited GIPO Fields string
See Also:
addGenomicFieldsNamesFromDescr()

makeQuantFieldsTabDelimStr

public java.lang.String makeQuantFieldsTabDelimStr()
makeQuantFieldsTabDelimStr() - make tab-delimited Quant Fields string
Returns:
tab-delimited Quant Fields string

cvtUserDataToQualCheck

private java.lang.String cvtUserDataToQualCheck(java.lang.String qcData)
cvtUserDataToQualCheck() - map user data range to valid QualCheck range
Parameters:
qcData - QualCheck data
Returns:
qcData, else if GenePix data then bad spot data

cvtGipoDatumToTabDelimStr

public java.lang.String cvtGipoDatumToTabDelimStr(Element d)
cvtGipoDatumToTabDelimStr() - convert Element to tab-delimited GIPO string
Parameters:
d - Element to convert
Returns:
tab-delimited GIPO string
See Also:
addGenomicDataFromDescr(java.lang.String), cvtUserDataToQualCheck(java.lang.String)
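
To make "tab-delimited string" concrete, here is a minimal sketch of joining one row of field values with tab characters. The class and method names are hypothetical, the sketch takes plain strings rather than an Element, and writing a null value as an empty column is an assumption rather than documented behavior.

```java
// Hypothetical sketch: build one tab-delimited row from field values.
class TabDelimSketch {
    /** Joins the values with '\t'; a null value is written as an empty column. */
    static String toTabDelim(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append('\t');                  // separator goes between columns only
            }
            sb.append(fields[i] == null ? "" : fields[i]);
        }
        return sb.toString();
    }
}
```

Keeping empty columns in place preserves the column positions, so each row lines up with the header row produced by a method such as makeGipoFieldsTabDelimStr().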

cvtQuantDatumToTabDelimStr

public java.lang.String cvtQuantDatumToTabDelimStr(Element d)
cvtQuantDatumToTabDelimStr() - convert Element to tab-delimited Quant string

writeMAEstartupFile

private boolean writeMAEstartupFile()
writeMAEstartupFile() - create MAE/Start.mae file
Returns:
true if successful, false if error
See Also:
MaeStartupData.writeMAEstartupFile()

writeConfigFile

private boolean writeConfigFile(MaeConfigData mcd)
writeConfigFile() - create Config/MaExplorerConfig-fn.txt file
Returns:
true if successful, false if error
See Also:
MaeConfigData.writeConfigFile()