SphinxBase 0.6
cont_ad_t Struct Reference

Continuous listening module or object.

#include <cont_ad.h>

Data Fields

int32(* adfunc )(ad_rec_t *ad, int16 *buf, int32 max)
 
ad_rec_t * ad
 A/D device argument for adfunc.
 
int32 rawmode
 Pass all input data through, without filtering silence.
 
int16 * adbuf
 Circular buffer for maintaining A/D data read until consumed.
 
int32 state
 State of data returned by most recent cont_ad_read call; CONT_AD_STATE_SIL or CONT_AD_STATE_SPEECH.
 
int32 read_ts
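 Absolute timestamp (total number of raw samples consumed up to the most recent cont_ad_read call, starting from the very beginning).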
 
int32 seglen
 Total number of raw samples consumed in the segment returned by the most recent cont_ad_read call.
 
int32 siglvl
 Max signal level for the data consumed by the most recent cont_ad_read call (dB range: 0-99).
 
int32 sps
 Samples/sec; moved from ad->sps to break the dependence on ad (change by N. Roy).
 
int32 eof
 Whether the source ad device has encountered EOF.
 
int32 spf
 Samples/frame; audio level is analyzed within frames.
 
int32 adbufsize
 Buffer size (number of samples).
 
int32 prev_sample
 For pre-emphasis filter.
 
int32 headfrm
 Frame number in adbuf with unconsumed A/D data.
 
int32 n_frm
 Number of complete frames of unconsumed A/D data in adbuf.
 
int32 n_sample
 Number of samples of unconsumed data in adbuf.
 
int32 tot_frm
 Total number of frames of A/D data read, including consumed ones.
 
int32 noise_level
 PWP: what we claim as the "current" noise level.
 
int32 * pow_hist
 Histogram of frame power, moving window, decayed.
 
char * frm_pow
 Frame power.
 
int32 auto_thresh
 Do automatic threshold adjustment or not.
 
int32 delta_sil
 Max silence power/frame ABOVE noise level.
 
int32 delta_speech
 Min speech power/frame ABOVE noise level.
 
int32 min_noise
 Noise lower than this is ignored.
 
int32 max_noise
 Noise higher than this signals an error.
 
int32 winsize
 How many frames to look at for speech detection.
 
int32 speech_onset
 Start speech on >= this many frames, out of winsize, with power >= delta_speech above the noise level.
 
int32 sil_onset
 End speech on >= this many frames, out of winsize, with power <= delta_sil above the noise level.
 
int32 leader
 Pad the beginning of speech with this many extra frames.
 
int32 trailer
 Pad the end of speech with this many extra frames.
 
int32 thresh_speech
 Frame considered to be speech if power >= thresh_speech (for transitioning from SILENCE to SPEECH state)
 
int32 thresh_sil
 Frame considered to be silence if power <= thresh_sil (for transitioning from SPEECH to SILENCE state)
 
int32 thresh_update
 Number of frames before next update to pow_hist/thresholds.
 
float32 adapt_rate
 Linear interpolation constant for rate at which noise level adapted to each estimate; range: 0-1; 0=> no adaptation, 1=> instant adaptation.
 
int32 tail_state
 State at the end of its internal buffer (internal use): CONT_AD_STATE_SIL or CONT_AD_STATE_SPEECH.
 
int32 win_startfrm
 Where next analysis window begins.
 
int32 win_validfrm
 Number of frames currently available from win_startfrm for analysis.
 
int32 n_other
 If in SILENCE state, number of frames in analysis window considered to be speech; otherwise number of frames considered to be silence.
 
spseg_tspseg_head
 First of unconsumed speech segments.
 
spseg_tspseg_tail
 Last of unconsumed speech segments.
 
FILE * rawfp
 If non-NULL, raw audio input data processed by cont_ad is dumped to this file.
 
FILE * logfp
 If non-NULL, write detailed logs of this object's progress to the file.
 
int32 n_calib_frame
 Number of frames of calibration data seen so far.
 

Detailed Description

Continuous listening module or object.

An application can open and maintain several such objects, if necessary. The module is always in one of two states: SILENCE or SPEECH. Transitions between the two are detected by sliding a window spanning several frames and looking for some minimum number of frames of the other type.
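
The window-based transition rule can be sketched as follows. This is only an illustration under stated assumptions, not the code in cont_ad.c: the type demo_detector_t, the constants DEMO_SIL and DEMO_SPEECH, and the function demo_step are hypothetical names, and the sketch evaluates one whole window at a time rather than sliding frame by frame.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CONT_AD_STATE_SIL / CONT_AD_STATE_SPEECH. */
enum { DEMO_SIL = 0, DEMO_SPEECH = 1 };

/* Hypothetical detector holding only the fields this sketch needs. */
typedef struct {
    int state;         /* current state: DEMO_SIL or DEMO_SPEECH */
    int winsize;       /* frames per analysis window */
    int speech_onset;  /* speech frames needed to leave silence */
    int sil_onset;     /* silence frames needed to leave speech */
    int thresh_speech; /* frame power >= this counts as speech */
    int thresh_sil;    /* frame power <= this counts as silence */
} demo_detector_t;

/* Count frames of the "other" type in frm_pow[0..winsize-1] and flip the
 * state when that count reaches the relevant onset. Returns the new state. */
static int demo_step(demo_detector_t *d, const int *frm_pow)
{
    int i, n_other = 0;

    assert(d != NULL && frm_pow != NULL && d->winsize > 0);
    for (i = 0; i < d->winsize; i++) {
        if (d->state == DEMO_SIL) {
            if (frm_pow[i] >= d->thresh_speech)
                n_other++;
        } else {
            if (frm_pow[i] <= d->thresh_sil)
                n_other++;
        }
    }
    if (d->state == DEMO_SIL && n_other >= d->speech_onset)
        d->state = DEMO_SPEECH;
    else if (d->state == DEMO_SPEECH && n_other >= d->sil_onset)
        d->state = DEMO_SIL;
    return d->state;
}
```

Using two separate thresholds gives the detector hysteresis: a frame whose power lies between thresh_sil and thresh_speech counts toward neither transition, so the state does not flicker on borderline frames.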

Definition at line 151 of file cont_ad.h.

Field Documentation

◆ ad

ad_rec_t* cont_ad_t::ad

A/D device argument for adfunc.

Also, ad->sps is used to determine the frame size (see spf).

Definition at line 154 of file cont_ad.h.

Referenced by cont_ad_calib().

◆ adapt_rate

float32 cont_ad_t::adapt_rate

Linear interpolation constant for rate at which noise level adapted to each estimate; range: 0-1; 0=> no adaptation, 1=> instant adaptation.

Definition at line 213 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().
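
To illustrate what a linear interpolation constant in the range 0-1 means here, the sketch below blends a current noise level with a new estimate. The function demo_adapt_noise is a hypothetical helper, and the exact update formula and rounding used by the library are assumptions.

```c
#include <assert.h>

/* Move noise_level toward estimate by the fraction adapt_rate.
 * adapt_rate = 0 keeps the old level; adapt_rate = 1 adopts the estimate.
 * The result is truncated toward zero when converted back to int. */
static int demo_adapt_noise(int noise_level, int estimate, float adapt_rate)
{
    assert(adapt_rate >= 0.0f && adapt_rate <= 1.0f);
    return (int)((float)noise_level
                 + adapt_rate * (float)(estimate - noise_level));
}
```

For example, with a current level of 10, a new estimate of 20, and adapt_rate set to 0.5, the adapted level lands halfway between the two.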

◆ adbuf

int16* cont_ad_t::adbuf

Circular buffer for maintaining A/D data read until consumed.

Definition at line 158 of file cont_ad.h.

Referenced by cont_ad_calib().
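
Reading from a circular buffer means a read may wrap past the end of the array. The sketch below shows one way to copy samples out of such a buffer; demo_ring_copy is a hypothetical helper, and the assumption that the first unconsumed sample sits at offset headfrm * spf is not confirmed by this page.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy n samples from a circular buffer of ring_size samples, starting at
 * sample offset start, into out. The copy is split in two when the
 * requested range wraps past the end of the buffer. */
static void demo_ring_copy(const int16_t *ring, int32_t ring_size,
                           int32_t start, int16_t *out, int32_t n)
{
    int32_t first;

    assert(ring != NULL && out != NULL);
    assert(ring_size > 0 && start >= 0 && start < ring_size);
    assert(n >= 0 && n <= ring_size);

    first = ring_size - start;      /* samples available before the wrap */
    if (first > n)
        first = n;
    memcpy(out, ring + start, (size_t)first * sizeof(int16_t));
    memcpy(out + first, ring, (size_t)(n - first) * sizeof(int16_t));
}
```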

◆ adbufsize

int32 cont_ad_t::adbufsize

Buffer size (number of samples).

Definition at line 186 of file cont_ad.h.

◆ adfunc

int32(* cont_ad_t::adfunc) (ad_rec_t *ad, int16 *buf, int32 max)

Definition at line 153 of file cont_ad.h.

◆ auto_thresh

int32 cont_ad_t::auto_thresh

Do automatic threshold adjustment or not.

Definition at line 197 of file cont_ad.h.

◆ delta_sil

int32 cont_ad_t::delta_sil

Max silence power/frame ABOVE noise level.

Definition at line 198 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().

◆ delta_speech

int32 cont_ad_t::delta_speech

Min speech power/frame ABOVE noise level.

Definition at line 199 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().
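
Since delta_sil and delta_speech are both expressed relative to the noise level, one plausible way to turn them into the absolute thresholds thresh_sil and thresh_speech is a simple addition. The sketch below assumes exactly that; demo_thresh_t and demo_thresholds are hypothetical names, and the library may compute its thresholds differently.

```c
#include <assert.h>

/* Hypothetical pair of absolute per-frame power thresholds. */
typedef struct {
    int thresh_sil;    /* power <= this counts as silence */
    int thresh_speech; /* power >= this counts as speech */
} demo_thresh_t;

/* Assumed relationship: each threshold is the noise level plus its delta. */
static demo_thresh_t demo_thresholds(int noise_level, int delta_sil,
                                     int delta_speech)
{
    demo_thresh_t t;

    assert(delta_sil >= 0 && delta_speech >= delta_sil);
    t.thresh_sil = noise_level + delta_sil;
    t.thresh_speech = noise_level + delta_speech;
    return t;
}
```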

◆ eof

int32 cont_ad_t::eof

Whether the source ad device has encountered EOF.

Definition at line 183 of file cont_ad.h.

Referenced by cont_ad_read().

◆ frm_pow

char* cont_ad_t::frm_pow

Frame power.

Definition at line 195 of file cont_ad.h.

◆ headfrm

int32 cont_ad_t::headfrm

Frame number in adbuf with unconsumed A/D data.

Definition at line 188 of file cont_ad.h.

Referenced by cont_ad_calib(), cont_ad_read(), and cont_ad_reset().

◆ leader

int32 cont_ad_t::leader

Pad the beginning of speech with this many extra frames.

Definition at line 205 of file cont_ad.h.

Referenced by cont_ad_get_params(), cont_ad_read(), and cont_ad_set_params().

◆ logfp

FILE* cont_ad_t::logfp

If non-NULL, write detailed logs of this object's progress to the file.

Controlled by user application via cont_ad_set_logfp(). NULL when cont_ad object is initially created.

Definition at line 231 of file cont_ad.h.

Referenced by cont_ad_read(), and cont_ad_set_logfp().

◆ max_noise

int32 cont_ad_t::max_noise

Noise higher than this signals an error.

Definition at line 201 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().

◆ min_noise

int32 cont_ad_t::min_noise

Noise lower than this is ignored.

Definition at line 200 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().

◆ n_calib_frame

int32 cont_ad_t::n_calib_frame

Number of frames of calibration data seen so far.

Definition at line 236 of file cont_ad.h.

Referenced by cont_ad_calib().

◆ n_frm

int32 cont_ad_t::n_frm

Number of complete frames of unconsumed A/D data in adbuf.

Definition at line 189 of file cont_ad.h.

Referenced by cont_ad_calib(), cont_ad_read(), and cont_ad_reset().

◆ n_other

int32 cont_ad_t::n_other

If in SILENCE state, number of frames in analysis window considered to be speech; otherwise number of frames considered to be silence.

Definition at line 222 of file cont_ad.h.

Referenced by cont_ad_read(), and cont_ad_reset().

◆ n_sample

int32 cont_ad_t::n_sample

Number of samples of unconsumed data in adbuf.

Definition at line 190 of file cont_ad.h.

Referenced by cont_ad_read(), and cont_ad_reset().

◆ noise_level

int32 cont_ad_t::noise_level

PWP: what we claim as the "current" noise level.

Definition at line 192 of file cont_ad.h.

◆ pow_hist

int32* cont_ad_t::pow_hist

Histogram of frame power, moving window, decayed.

Definition at line 194 of file cont_ad.h.

Referenced by cont_ad_calib(), and cont_ad_powhist_dump().

◆ prev_sample

int32 cont_ad_t::prev_sample

For pre-emphasis filter.

Definition at line 187 of file cont_ad.h.

◆ rawfp

FILE* cont_ad_t::rawfp

If non-NULL, raw audio input data processed by cont_ad is dumped to this file.

Controlled by user application via cont_ad_set_rawfp(). NULL when cont_ad object is initially created.

Definition at line 227 of file cont_ad.h.

Referenced by cont_ad_set_rawfp().

◆ rawmode

int32 cont_ad_t::rawmode

Pass all input data through, without filtering silence.

Definition at line 156 of file cont_ad.h.

Referenced by cont_ad_read().

◆ read_ts

int32 cont_ad_t::read_ts

Absolute timestamp (total number of raw samples consumed up to the most recent cont_ad_read call, starting from the very beginning).

Note that this is a 32-bit integer; applications should guard against overflow.

Definition at line 167 of file cont_ad.h.

Referenced by cont_ad_read().
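
One way an application might guard against overflow of a 32-bit sample counter is sketched below. Both helpers are hypothetical, and the first assumes the counter wraps in two's complement, which this page does not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Samples elapsed between two timestamp readings. Unsigned subtraction is
 * well defined modulo 2^32, so the result stays correct across a single
 * wrap of the counter (assuming two's-complement wraparound). */
static uint32_t demo_elapsed_samples(int32_t earlier_ts, int32_t later_ts)
{
    return (uint32_t)later_ts - (uint32_t)earlier_ts;
}

/* Whole seconds of audio a non-negative int32 sample counter can hold
 * at the given sampling rate. */
static int32_t demo_seconds_until_overflow(int32_t sps)
{
    assert(sps > 0);
    return INT32_MAX / sps;
}
```

At 16000 samples/sec, the second helper yields 134217 seconds, roughly 37 hours of continuous audio, so long-running listeners are the ones that need this kind of check.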

◆ seglen

int32 cont_ad_t::seglen

Total number of raw samples consumed in the segment returned by the most recent cont_ad_read call.

Can be used to detect silence segments that have stretched long enough to terminate an utterance.

Definition at line 171 of file cont_ad.h.

Referenced by cont_ad_read().
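
A minimal sketch of the end-of-utterance check follows. The function demo_silence_ends_utterance and its min_sil_ms parameter are hypothetical; the caller is assumed to pass is_silence as the result of comparing the state field against CONT_AD_STATE_SIL, along with the seglen and sps fields.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if the most recent segment is silence lasting at least
 * min_sil_ms milliseconds, else 0. */
static int demo_silence_ends_utterance(int is_silence, int32_t seglen,
                                       int32_t sps, int32_t min_sil_ms)
{
    assert(seglen >= 0 && sps > 0 && min_sil_ms >= 0);
    if (!is_silence)
        return 0;
    /* seglen / sps seconds >= min_sil_ms / 1000 seconds, rewritten as a
     * 64-bit cross-multiplication to avoid division and int32 overflow. */
    return (int64_t)seglen * 1000 >= (int64_t)min_sil_ms * (int64_t)sps;
}
```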

◆ siglvl

int32 cont_ad_t::siglvl

Max signal level for the data consumed by the most recent cont_ad_read call (dB range: 0-99).

Can be used to update a V-U meter, for example.

Definition at line 175 of file cont_ad.h.

Referenced by cont_ad_read().

◆ sil_onset

int32 cont_ad_t::sil_onset

End speech on >= this many frames, out of winsize, with power <= delta_sil above the noise level.

Definition at line 204 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().

◆ speech_onset

int32 cont_ad_t::speech_onset

Start speech on >= this many frames, out of winsize, with power >= delta_speech above the noise level.

Definition at line 203 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().

◆ spf

int32 cont_ad_t::spf

Samples/frame; audio level is analyzed within frames.

Definition at line 185 of file cont_ad.h.

Referenced by cont_ad_calib(), cont_ad_calib_loop(), cont_ad_powhist_dump(), and cont_ad_read().

◆ sps

int32 cont_ad_t::sps

Samples/sec; moved from ad->sps to break the dependence on ad (change by N. Roy).

Definition at line 180 of file cont_ad.h.

Referenced by cont_ad_powhist_dump().

◆ spseg_head

spseg_t* cont_ad_t::spseg_head

First of unconsumed speech segments.

Definition at line 224 of file cont_ad.h.

Referenced by cont_ad_read(), and cont_ad_reset().

◆ spseg_tail

spseg_t* cont_ad_t::spseg_tail

Last of unconsumed speech segments.

Definition at line 225 of file cont_ad.h.

Referenced by cont_ad_read(), and cont_ad_reset().

◆ state

int32 cont_ad_t::state

State of data returned by most recent cont_ad_read call; CONT_AD_STATE_SIL or CONT_AD_STATE_SPEECH.

Definition at line 165 of file cont_ad.h.

Referenced by cont_ad_read().

◆ tail_state

int32 cont_ad_t::tail_state

State at the end of its internal buffer (internal use): CONT_AD_STATE_SIL or CONT_AD_STATE_SPEECH.

Note: This is different from cont_ad_t.state.

Definition at line 217 of file cont_ad.h.

Referenced by cont_ad_read(), and cont_ad_reset().

◆ thresh_sil

int32 cont_ad_t::thresh_sil

Frame considered to be silence if power <= thresh_sil (for transitioning from SPEECH to SILENCE state)

Definition at line 210 of file cont_ad.h.

◆ thresh_speech

int32 cont_ad_t::thresh_speech

Frame considered to be speech if power >= thresh_speech (for transitioning from SILENCE to SPEECH state)

Definition at line 208 of file cont_ad.h.

◆ thresh_update

int32 cont_ad_t::thresh_update

Number of frames before next update to pow_hist/thresholds.

Definition at line 212 of file cont_ad.h.

◆ tot_frm

int32 cont_ad_t::tot_frm

Total number of frames of A/D data read, including consumed ones.

Definition at line 191 of file cont_ad.h.

Referenced by cont_ad_powhist_dump(), and cont_ad_read().

◆ trailer

int32 cont_ad_t::trailer

Pad the end of speech with this many extra frames.

Definition at line 206 of file cont_ad.h.

Referenced by cont_ad_get_params(), and cont_ad_set_params().
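
The effect of the leader and trailer padding can be pictured as widening a detected speech span on both sides. The sketch below is only an illustration: demo_span_t and demo_pad_span are hypothetical, and clamping to the range of available frames is an assumption about how the edges are handled.

```c
#include <assert.h>

/* Hypothetical half-open span of frames: [start, end). */
typedef struct {
    int start;
    int end;
} demo_span_t;

/* Widen a detected speech span by leader frames before and trailer frames
 * after, clamped to the available range [0, n_frames). */
static demo_span_t demo_pad_span(demo_span_t s, int leader, int trailer,
                                 int n_frames)
{
    assert(leader >= 0 && trailer >= 0);
    assert(s.start >= 0 && s.start <= s.end && s.end <= n_frames);

    s.start -= leader;
    if (s.start < 0)
        s.start = 0;
    s.end += trailer;
    if (s.end > n_frames)
        s.end = n_frames;
    return s;
}
```

Padding of this kind helps keep low-energy speech onsets and endings, which fall below the speech threshold, inside the returned segment.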

◆ win_startfrm

int32 cont_ad_t::win_startfrm

Where next analysis window begins.

Definition at line 220 of file cont_ad.h.

Referenced by cont_ad_read(), and cont_ad_reset().

◆ win_validfrm

int32 cont_ad_t::win_validfrm

Number of frames currently available from win_startfrm for analysis.

Definition at line 221 of file cont_ad.h.

Referenced by cont_ad_read(), cont_ad_reset(), and cont_ad_set_params().

◆ winsize

int32 cont_ad_t::winsize

How many frames to look at for speech detection.

Definition at line 202 of file cont_ad.h.

Referenced by cont_ad_get_params(), cont_ad_read(), and cont_ad_set_params().


The documentation for this struct was generated from the following file: