libStatGen Software 1
Loading...
Searching...
No Matches
SamQuerySeqWithRefIter Class Reference

Iterates through the query and compare with reference. More...

#include <SamQuerySeqWithRefHelper.h>

Public Member Functions

 SamQuerySeqWithRefIter (SamRecord &record, GenomeSequence &refSequence, bool forward=true)
 
bool reset (bool forward=true)
 Reset to start at the beginning of the record.
 
bool getNextMatchMismatch (SamSingleBaseMatchInfo &matchMismatchInfo)
 Returns information for the next position where the query and the reference match or mismatch.
 

Detailed Description

Iterates through the query and compare with reference.

NOTE: References to the GenomeSequence and SamRecord are stored, the objects are not copied, so they must remain valid as long as this class is used.

Definition at line 58 of file SamQuerySeqWithRefHelper.h.

Constructor & Destructor Documentation

◆ SamQuerySeqWithRefIter()

SamQuerySeqWithRefIter::SamQuerySeqWithRefIter ( SamRecord record,
GenomeSequence refSequence,
bool  forward = true 
)

Definition at line 24 of file SamQuerySeqWithRefHelper.cpp.

27 : myRecord(record),
28 myRefSequence(refSequence),
29 myCigar(NULL),
30 myStartOfReadOnRefIndex(INVALID_GENOME_INDEX),
31 myQueryIndex(0),
32 myForward(forward)
33{
34 myCigar = myRecord.getCigarInfo();
35 myStartOfReadOnRefIndex =
36 refSequence.getGenomePosition(myRecord.getReferenceName());
37
38 if(myStartOfReadOnRefIndex != INVALID_GENOME_INDEX)
39 {
40 // This reference name was found in the reference file, so
41 // add the start position.
42 myStartOfReadOnRefIndex += myRecord.get0BasedPosition();
43 }
44
45 if(!forward)
46 {
47 myQueryIndex = myRecord.getReadLength() - 1;
48 }
49}
genomeIndex_t getGenomePosition(const char *chromosomeName, unsigned int chromosomeIndex) const
given a chromosome name and position, return the genome position
const char * getReferenceName()
Get the reference sequence name (RNAME) of the record.
Cigar * getCigarInfo()
Returns a pointer to the Cigar object associated with this record.
int32_t getReadLength()
Get the length of the read.
int32_t get0BasedPosition()
Get the 0-based(BAM) leftmost position of the record.

◆ ~SamQuerySeqWithRefIter()

SamQuerySeqWithRefIter::~SamQuerySeqWithRefIter ( )
virtual

Definition at line 52 of file SamQuerySeqWithRefHelper.cpp.

53{
54}

Member Function Documentation

◆ getNextMatchMismatch()

bool SamQuerySeqWithRefIter::getNextMatchMismatch ( SamSingleBaseMatchInfo matchMismatchInfo)

Returns information for the next position where the query and the reference match or mismatch.

To be a match or mismatch, both the query and reference must have a base that is not 'N'. This means: insertions and deletions are not mismatches or matches. 'N' bases are not matches or mismatches

Parameters
matchMismatchInforeturn parameter with the information about the matching/mismatching base.
Returns
true if there was another match/mismatch (matchMismatchInfo was set); false if not.

Definition at line 102 of file SamQuerySeqWithRefHelper.cpp.

103{
104 // Check whether or not this read is mapped.
105 // If the read is not mapped, return no matches.
106 if(!SamFlag::isMapped(myRecord.getFlag()))
107 {
108 // Not mapped.
109 return(false);
110 }
111
112 // Check that the Cigar is set.
113 if(myCigar == NULL)
114 {
115 // Error.
116 throw(std::runtime_error("Cannot determine matches/mismatches since failed to retrieve the cigar"));
117 return(false);
118 }
119
120 // If myStartOfReadOnRefIndex is the default (unset) value, then
121 // the reference was not found, so return false, no matches/mismatches.
122 if(myStartOfReadOnRefIndex == INVALID_GENOME_INDEX)
123 {
124 // This reference name was not found in the reference file, so just
125 // return no matches/mismatches.
126 return(false);
127 }
128
129
130 // Repull the read length from the record to check just in case the
131 // record has changed length.
132 // Loop until a match or mismatch is found as long as query index
133 // is still within the read (Loop is broken by a return).
134 while((myQueryIndex < myRecord.getReadLength()) &&
135 (myQueryIndex >= 0))
136 {
137 // Still more bases, look for a match/mismatch.
138
139 // Get the reference offset for this read position.
140 int32_t refOffset = myCigar->getRefOffset(myQueryIndex);
141 if(refOffset == Cigar::INDEX_NA)
142 {
143 // This is either a softclip or an insertion
144 // which do not count as a match or a mismatch, so
145 // go to the next index.
146 nextIndex();
147 continue;
148 }
149
150 // Both the reference and the read have a base, so get the bases.
151 char readBase = myRecord.getSequence(myQueryIndex, SamRecord::NONE);
152 char refBase = myRefSequence[myStartOfReadOnRefIndex + refOffset];
153
154 // If either the read or the reference bases are unknown, then
155 // it does not count as a match or a mismatch.
156 if(BaseUtilities::isAmbiguous(readBase) ||
158 {
159 // Either the reference base or the read base are unknown,
160 // so skip this position.
161 nextIndex();
162 continue;
163 }
164
165 // Both the read & the reference have a known base, so it is either
166 // a match or a mismatch.
167 matchMismatchInfo.setQueryIndex(myQueryIndex);
168
169 // Check if they are equal.
170 if(BaseUtilities::areEqual(readBase, refBase))
171 {
172 // Match.
173 matchMismatchInfo.setType(SamSingleBaseMatchInfo::MATCH);
174 // Increment the query index to the next position.
175 nextIndex();
176 return(true);
177 }
178 else
179 {
180 // Mismatch
181 matchMismatchInfo.setType(SamSingleBaseMatchInfo::MISMATCH);
182 // Increment the query index to the next position.
183 nextIndex();
184 return(true);
185 }
186 }
187
188 // No matches or mismatches were found, so return false.
189 return(false);
190}
static bool isAmbiguous(char base)
Returns whether or not the specified bases is an indicator for ambiguity.
static bool areEqual(char base1, char base2)
Returns whether or not two bases are equal (case insensitive), if one of the bases is '=',...
static const int32_t INDEX_NA
Value associated with an index that is not applicable/does not exist, used for converting between que...
Definition Cigar.h:492
int32_t getRefOffset(int32_t queryIndex)
Return the reference offset associated with the specified query index or INDEX_NA based on this cigar...
Definition Cigar.cpp:187
@ NONE
Leave the sequence as is.
Definition SamRecord.h:58
uint16_t getFlag()
Get the flag (FLAG).
const char * getSequence()
Returns the SAM formatted sequence string (SEQ), translating the base as specified by setSequenceTran...
void setQueryIndex(int32_t queryIndex)
Set the query index for this object.
void setType(Type newType)
Set the type (match/mismatch/unkown) for this object.

References BaseUtilities::areEqual(), SamRecord::getFlag(), SamRecord::getReadLength(), Cigar::getRefOffset(), SamRecord::getSequence(), Cigar::INDEX_NA, BaseUtilities::isAmbiguous(), SamRecord::NONE, SamSingleBaseMatchInfo::setQueryIndex(), and SamSingleBaseMatchInfo::setType().

Referenced by SamFilter::clipOnMismatchThreshold(), and SamFilter::sumMismatchQuality().

◆ reset()

bool SamQuerySeqWithRefIter::reset ( bool  forward = true)

Reset to start at the beginning of the record.

This will re-read values from SamRecord, so can be used if it has changed to contain information for a new record.

Parameters
forwardtrue means to start from the beginning and go to the end; false means to start from the end and go to the beginning.
Returns
true if successfully reset; false if failed to read the Cigar.

Definition at line 58 of file SamQuerySeqWithRefHelper.cpp.

59{
60 myCigar = myRecord.getCigarInfo();
61 if(myCigar == NULL)
62 {
63 // Failed to get Cigar.
64 return(false);
65 }
66
67 // Get where the position of where this read starts as mapped to the
68 // reference.
69 myStartOfReadOnRefIndex =
70 myRefSequence.getGenomePosition(myRecord.getReferenceName());
71
72 if(myStartOfReadOnRefIndex != INVALID_GENOME_INDEX)
73 {
74 // This reference name was found in the reference file, so
75 // add the start position.
76 myStartOfReadOnRefIndex += myRecord.get0BasedPosition();
77 }
78
79 myForward = forward;
80
81 if(myForward)
82 {
83 myQueryIndex = 0;
84 }
85 else
86 {
87 // reverse, so start at the last entry.
88 myQueryIndex = myRecord.getReadLength() - 1;
89 }
90 return(true);
91}

References SamRecord::get0BasedPosition(), SamRecord::getCigarInfo(), GenomeSequence::getGenomePosition(), SamRecord::getReadLength(), and SamRecord::getReferenceName().


The documentation for this class was generated from the following files: