ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Unknown

Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion (2015)

Javier Tejedor; Doroteo Toledano; Paula Lopez-Otero; Laura Docio-Fernandez; Carmen Garcia-Mateo; Antonio Cardenal; Julian Echeverry-Correa; Alejandro Coucheiro-Limeres; Julia Olcoz; Antonio Miguel

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-08-08

Description: Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. Nowadays, it is receiving much interest due to the large volume of multimedia information. STD differs from automatic speech recognition (ASR) in that ASR is interested in all the terms/words that appear in the speech data, whereas STD focuses on a selected list of search terms that must be detected within the speech data. This paper presents the systems submitted to the STD ALBAYZIN 2014 evaluation, held as a part of the ALBAYZIN 2014 evaluation campaign within the context of the IberSPEECH 2014 conference. This is the first STD evaluation that deals with the Spanish language. The evaluation consists of retrieving the speech files that contain the search terms, indicating their start and end times within the appropriate speech file, along with a score value that reflects the confidence given to the detection of the search term. The evaluation is conducted on a Spanish spontaneous speech database, which comprises a set of talks from workshops and amounts to about 7 h of speech. We present the database, the evaluation metrics, the systems submitted to the evaluation, the results, and a detailed discussion. Four different research groups took part in the evaluation. Evaluation results show reasonable performance for a moderate out-of-vocabulary term rate. This paper compares the systems submitted to the evaluation and provides an in-depth analysis based on some search term properties (term length, in-vocabulary/out-of-vocabulary terms, single-word/multi-word terms, and in-language/foreign terms).

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

2

Unknown

Regularized minimum class variance extreme learning machine for language recognition (2015)

Jiaming Xu; Wei-Qiang Zhang; Jia Liu; Shanhong Xia

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-08-14

Description: Support vector machines (SVMs) have played an important role in state-of-the-art language recognition systems. The recently developed extreme learning machine (ELM) tends to have better scalability and achieve similar or much better generalization performance at much faster learning speed than traditional SVM. Inspired by the excellent features of ELM, in this paper, we propose a novel method called regularized minimum class variance extreme learning machine (RMCVELM) for language recognition. The RMCVELM aims at minimizing empirical risk, structural risk, and the intra-class variance of the training data in the decision space simultaneously. The proposed method, which is computationally inexpensive compared to SVM, suggests a new classifier for language recognition and is evaluated on the 2009 National Institute of Standards and Technology (NIST) language recognition evaluation (LRE). Experimental results show that the proposed RMCVELM obtains much better performance than SVM. In addition, the RMCVELM can also be applied to the popular i-vector space and achieves results comparable to the existing scoring methods.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

3

Unknown

Emotion in the singing voice—a deeper look at acoustic features in the light of automatic classification (2015)

Florian Eyben; Gláucia Salomão; Johan Sundberg; Klaus Scherer; Björn Schuller

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-08-15

Description: We investigate the automatic recognition of emotions in the singing voice and study the worth and role of a variety of relevant acoustic parameters. The data set contains phrases and vocalises sung by eight renowned professional opera singers in ten different emotions and a neutral state. The states are mapped to ternary arousal and valence labels. We propose a small set of relevant acoustic features based on our previous findings on the same data and compare it with a large-scale state-of-the-art feature set for paralinguistics recognition, the baseline feature set of the Interspeech 2013 Computational Paralinguistics ChallengE (ComParE). A feature importance analysis with respect to classification accuracy and correlation of features with the targets is provided in the paper. Results show that the classification performance with both feature sets is similar for arousal, while the ComParE set is superior for valence. Intra-singer feature ranking criteria further improve the classification accuracy significantly in a leave-one-singer-out cross-validation.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

4

Unknown

Exploiting spectro-temporal locality in deep learning based acoustic event detection (2015)

Miquel Espi; Masakiyo Fujimoto; Keisuke Kinoshita; Tomohiro Nakatani

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-09-15

Description: In recent years, deep learning has not only permeated the computer vision and speech recognition research fields but also fields such as acoustic event detection (AED). One of the aims of AED is to detect and classify non-speech acoustic events occurring in conversation scenes including those produced by both humans and the objects that surround us. In AED, deep learning has enabled modeling of detail-rich features, and among these, high resolution spectrograms have shown a significant advantage over existing predefined features (e.g., Mel-filter bank) that compress and reduce detail. In this paper, we further assess the importance of feature extraction for deep learning-based acoustic event detection. AED, based on spectrogram-input deep neural networks, exploits the fact that sounds have “global” spectral patterns, but sounds also have “local” properties such as being more transient or smoother in the time-frequency domain. These can be exposed by adjusting the time-frequency resolution used to compute the spectrogram, or by using a model that exploits locality, leading us to explore two different feature extraction strategies in the context of deep learning: (1) using multiple resolution spectrograms simultaneously and analyzing the overall and event-wise influence to combine the results, and (2) introducing the use of convolutional neural networks (CNN), a state-of-the-art 2D feature extraction model that exploits local structures, with log power spectrogram input for AED. An experimental evaluation shows that the approaches we describe outperform our state-of-the-art deep learning baseline with a noticeable gain in the CNN case and provides insights regarding CNN-based spectrogram characterization for AED.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

5

Unknown

Biomimetic spectro-temporal features for music instrument recognition in isolated notes and solo phrases (2015)

Kailash Patil; Mounya Elhilali

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-09-26

Description: The identity of musical instruments is reflected in the acoustic attributes of musical notes played with them. Recently, it has been argued that these characteristics of musical identity (or timbre) can be best captured through an analysis that encompasses both time and frequency domains, with a focus on the modulations or changes in the signal in the spectrotemporal space. This representation mimics the spectrotemporal receptive field (STRF) analysis believed to underlie processing in the central mammalian auditory system, particularly at the level of primary auditory cortex. How well this STRF representation captures the timbral identity of musical instruments in continuous solo recordings remains unclear. The current work investigates the applicability of the STRF feature space for instrument recognition in solo musical phrases and explores best approaches to leveraging knowledge from isolated musical notes for instrument recognition in solo recordings. The study presents an approach for parsing solo performances into their individual note constituents and adapting back-end classifiers using support vector machines to achieve a generalization of instrument recognition to off-the-shelf, commercially available solo music.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

6

Unknown

Small-parallel exemplar-based voice conversion in noisy environments using affine non-negative matrix factorization (2015)

Ryo Aihara; Takao Fujii; Toru Nakashika; Tetsuya Takiguchi; Yasuo Ariki

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-11-26

Description: The need to have a large amount of parallel data is a large hurdle for the practical use of voice conversion (VC). This paper presents a novel framework of exemplar-based VC that only requires a small number of parallel exemplars. In our previous work, a VC technique using non-negative matrix factorization (NMF) for noisy environments was proposed. This method requires parallel exemplars (which consist of the source exemplars and target exemplars that have the same texts uttered by the source and target speakers) for dictionary construction. In the framework of conventional Gaussian mixture model (GMM)-based VC, some approaches that do not need parallel exemplars have been proposed. However, in the framework of exemplar-based VC for noisy environments, such a method has never been proposed. In this paper, an adaptation matrix in an NMF framework is introduced to adapt the source dictionary to the target dictionary. This adaptation matrix is estimated using only a small parallel speech corpus. We refer to this method as affine NMF, and its effectiveness has been confirmed by comparison with a conventional NMF-based method and a GMM-based method in noisy environments.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

7

Unknown

Feedback recurrent neural network-based embedded vector and its application in topic model (2016)

Lian-sheng Li, Sheng-jiang Gan and Xiang-dong Yin

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-07-17

Description: While mining topics in a document collection, in order to capture the relationships between words and further improve the effectiveness of discovered topics, this paper proposed a feedback recurrent neural net...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

8

Unknown

Optimizing linear routing in the ToLHnet protocol to improve performance over long RS-485 buses (2016)

Michele Alessandrini, Giorgio Biagetti, Paolo Crippa, Laura Falaschetti, Simone Orcioni and Claudio Turchetti

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-07-26

Description: As the adoption of sensing and control networks rises to encompass the most diverse fields, the need for simple, efficient interconnection between many different devices will become ever more pressing. Though ...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

9

Unknown

Energy-aware memory management for embedded multidimensional signal processing applications (2016)

Florin Balasa, Noha Abuaesh, Cristian V. Gingu, Ilie I. Luican and Hongwei Zhu

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-07-26

Description: In real-time data-intensive multimedia processing applications, data transfer and storage significantly influence, if not dominate, all the major cost parameters of the design space—namely energy consumption, ...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

10

Unknown

A low-power wireless system for energy consumption analysis at mains sockets (2016)

Matthias Altmann, Peter Schlegl and Klaus Volbert

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-07-31

Description: Improving energy efficiency and reducing energy wastage is an important topic of our time. But it is quite difficult to figure out how much of our total electricity bill can be mapped to which device or at wha...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

11

Unknown

A hybrid fixed-function and microprocessor solution for high-throughput broad-phase collision detection (2016)

Muiris Woulfe and Michael Manzke

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-07-15

Description: We present a hybrid system spanning a fixed-function microarchitecture and a general-purpose microprocessor, designed to amplify the throughput and decrease the power dissipation of collision detection relativ...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

12

Unknown

Voice activity detection algorithm based on long-term pitch information (2016)

Xu-Kui Yang, Liang He, Dan Qu and Wei-Qiang Zhang

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2016-07-09

Description: A new voice activity detection algorithm based on long-term pitch divergence is presented. The long-term pitch divergence not only decomposes speech signals with a bionic decomposition but also makes full use ...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

13

Unknown

Research on recovery strategy in embedded real-time main memory databases (2016)

Tan Yonghong and Yin Xiangdong

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-05-06

Description: In order to recover data from embedded real-time main memory databases effectively and efficiently, this paper proposes a real-time log-based recovery approach. With respect to the real-time requirement in emb...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

14

Unknown

E-transportation: the role of embedded systems in electric energy transfer from grid to vehicle (2016)

Federico Baronti, Mo-Yuen Chow, Chengbin Ma, Habiballah Rahimi-Eichi and Roberto Saletti

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-05-11

Description: Electric vehicles (EVs) are a promising solution to reduce the transportation dependency on oil, as well as the environmental concerns. Realization of E-transportation relies on providing electrical energy to ...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

15

Unknown

Backscattering UWB/UHF hybrid solutions for multi-reader multi-tag passive RFID systems (2016)

Tan Yonghong, Yin Xiangdong, Roberto Alesii, Piergiuseppe Di Marco, Fortunato Santucci, Pietro Savazzi, Roberto Valentini and Anna Vizziello

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-05-06

Description: Ultra-wideband (UWB) technology is foreseen as a promising solution to overcome the limits of ultra-high frequency (UHF) techniques toward the development of green radio frequency identification (RFID) systems...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

16

Unknown

Low power memory allocation and mapping for area-constrained systems-on-chips (2016)

Manuel Strobel, Marcus Eggenberger and Martin Radetzki

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-07-19

Description: Large fractions of today’s embedded systems’ power consumption can be attributed to the memory subsystem. In order to reduce this fraction, we propose a mathematical model to optimize on-chip memory configurat...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

17

Unknown

Sensing user context and habits for run-time energy optimization (2016)

Ismat Chaib Draa, Smail Niar, Jamel Tayeb, Emmanuelle Grislin and Mikael Desertot

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-07-19

Description: Optimizing energy consumption in modern mobile handheld devices plays a very important role as lowering energy consumption impacts battery life and system reliability. With next-generation smartphones and tabl...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

18

Unknown

Query-by-Example Spoken Term Detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion (2013)

Javier Tejedor; Doroteo Toledano; Xavier Anguera; Amparo Varona; Lluís Hurtado; Antonio Miguel; José Colás

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2013-09-18

Description: Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. Nowadays, it has been receiving much interest due to the high volume of information stored in audio or audiovisual format. QbE STD differs from automatic speech recognition (ASR) and keyword spotting (KWS)/spoken term detection (STD) since ASR is interested in all the terms/words that appear in the speech signal and KWS/STD relies on a textual transcription of the search term to retrieve the speech data. This paper presents the systems submitted to the ALBAYZIN 2012 QbE STD evaluation held as a part of the ALBAYZIN 2012 evaluation campaign within the context of the IberSPEECH 2012 Conference. The evaluation consists of retrieving the speech files that contain the input queries, indicating their start and end timestamps within the appropriate speech file. Evaluation is conducted on a Spanish spontaneous speech database containing a set of talks from MAVIR workshops, which amount to about 7 h of speech in total. We present the database, the metric, and the systems submitted, along with all results and some discussion. Four different research groups took part in the evaluation. Evaluation results show the difficulty of this task, and the limited performance indicates there is still a lot of room for improvement. The best result is achieved by a dynamic time warping-based search over Gaussian posteriorgrams/posterior phoneme probabilities. This paper also compares the systems, with the aim of establishing the best technique for this difficult task and of defining promising directions for this relatively novel task.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

19

Unknown

An audio watermark-based speech bandwidth extension method (2013)

Zhe Chen; Chengyong Zhao; Guosheng Geng; Fuliang Yin

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2013-06-07

Description: A novel speech bandwidth extension method based on an audio watermark is presented in this paper. The time-domain and frequency-domain envelope parameters are extracted from the high-frequency components of the speech signal, and then these parameters are embedded in the corresponding narrowband speech bit stream by the modified least significant bit watermark method, which uses perceptual properties. At the decoder, the wideband speech is reproduced with the reconstruction of high-frequency components based on the parameters extracted from the bit stream of the narrowband speech. The proposed method can reduce the poor auditory effect caused by large local distortion. The simulation results show that the synthesized wideband speech has low spectral distortion and its speech perception quality is greatly improved.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

20

Unknown

Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification (2015)

Zhaofeng Zhang; Longbiao Wang; Atsuhiko Kai; Takanori Yamada; Weifeng Li; Masahiro Iwahashi

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-05-13

Description: Deep neural network (DNN)-based approaches have been shown to be effective in many automatic speech recognition systems. However, few works have focused on DNNs for distant-talking speaker recognition. In this study, a bottleneck feature derived from a DNN and a cepstral domain denoising autoencoder (DAE)-based dereverberation are presented for distant-talking speaker identification, and a combination of these two approaches is proposed. For the DNN-based bottleneck feature, we noted that DNNs can transform the reverberant speech feature to a new feature space with greater discriminative classification ability for distant-talking speaker recognition. Conversely, cepstral domain DAE-based dereverberation tries to suppress the reverberation by mapping the cepstrum of reverberant speech to that of clean speech with the expectation of improving the performance of distant-talking speaker recognition. Since the DNN-based discriminant bottleneck feature and DAE-based dereverberation have a strong complementary nature, the combination of these two methods is expected to be very effective for distant-talking speaker identification. A speaker identification experiment was performed on a distant-talking speech set, with reverberant environments differing from the training environments. In suppressing late reverberation, our method outperformed some state-of-the-art dereverberation approaches such as the multichannel least mean squares (MCLMS). Compared with the MCLMS, we obtained a reduction in relative error rates of 21.4% for the bottleneck feature and 47.0% for the autoencoder feature. Moreover, the combination of likelihoods of the DNN-based bottleneck feature and DAE-based dereverberation further improved the performance.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

21

Unknown

Lightweight multi-DOA tracking of mobile speech sources (2015)

Caleb Rascon; Gibran Fuentes; Ivan Meza

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-05-08

Description: Estimating the directions of arrival (DOAs) of multiple simultaneous mobile sound sources is an important step for various audio signal processing applications. In this contribution, we present an approach that improves upon our previous work and is now able to estimate the DOAs of multiple mobile speech sources, while being light in resources, both hardware-wise (only using three microphones) and software-wise. This approach takes advantage of the fact that simultaneous speech sources do not completely overlap each other. To evaluate the performance of this approach, a multi-DOA estimation evaluation system was developed based on a corpus collected from different acoustic scenarios named Acoustic Interactions for Robot Audition (AIRA).

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

22

Unknown

Wise teachers train better DNN acoustic models (2016)

Ryan Price, Ken-ichi Iso and Koichi Shinoda

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2016-04-13

Description: Automatic speech recognition is becoming more ubiquitous as recognition performance improves, capable devices increase in number, and areas of new application open up. Neural network acoustic models that can u...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

23

Unknown

Adaptive Aloha anti-collision algorithms for RFID systems (2016)

Feng Zheng and Thomas Kaiser

Springer

In: EURASIP Journal on Embedded Systems

Details

Publication date: 2016-04-13

Description: In this paper, we propose two adaptive frame size Aloha algorithms, namely adaptive frame size Aloha 1 (AFSA1) and adaptive frame size Aloha 2 (AFSA2), for solving radio frequency identification (RFID) multipl...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

24

Unknown

Speech signal modeling using multivariate distributions (2015)

Ali Aroudi; Hadi Veisi; Hossein Sameti; Zahra Mafakheri

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publication date: 2015-12-31

Description: Using a proper distribution function for the speech signal or for its representations is of crucial importance in statistical-based speech processing algorithms. Although the most commonly used probability density function (pdf) for speech signals is Gaussian, recent studies have shown the superiority of super-Gaussian pdfs. A large research effort has focused on the investigation of a univariate case of speech signal distribution; however, in this paper, we study the multivariate distributions of the speech signal and its representations using the conventional distribution functions, e.g., multivariate Gaussian and multivariate Laplace, and the copula-based multivariate distributions as candidates. The copula-based technique is a powerful method in modeling non-Gaussian multivariate distributions with non-linear inter-dimensional dependency. The level of similarity between the candidate pdfs and the real speech pdf in different domains is evaluated using the energy goodness-of-fit test. In our evaluations, the best-fitted distributions for speech signal vectors with different lengths in various domains are determined. A similar experiment is performed for different classes of English phonemes (fricatives, nasals, stops, vowels, and semivowels/glides). The evaluation results demonstrate that the multivariate distribution of speech signals in different domains is mostly super-Gaussian, except for Mel-frequency cepstral coefficients. Also, the results confirm that the distribution of the different phoneme classes is better statistically modeled by a mixture of Gaussian and Laplace pdfs. The copula-based distributions provide better statistical modeling of vectors representing discrete Fourier transform (DFT) amplitude of speech vectors with a length shorter than 500 ms.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

25

Unknown

Localization based stereo speech source separation using probabilistic time-frequency masking and deep neural networks (2016)

Yang Yu, Wenwu Wang and Peng Han

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Details

Publikationsdatum: 2016-03-06

Beschreibung: Time-frequency (T-F) masking is an effective method for stereo speech source separation. However, reliable estimation of the T-F mask from sound mixtures is a challenging task, especially when room reverberati...

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

26

Embedded mobile crowd service systems based on opportunistic geological grid and dynamical segmentation (2015)

Chunhua Dong; Li Wang; Kunming Zhao

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2015-12-27

Description: To address problems such as the demand for geographic information services, the short lifetime of embedded systems, and network collapse, embedded mobile crowd service systems based on an opportunistic geological grid and dynamic segmentation are proposed. First, based on the characteristics of geographical spatial information resources and service time series, a mobile geographic crowd service system is established to provide sensing data through the mobile geographic crowd service model. Then, according to the complex data of the embedded equipment in the geographic crowd service system and the relationship between the geographic information service object and the user, an embedded system based on the opportunistic geological grid is proposed. Finally, the geographic crowd system is optimized by dynamic segmentation of the opportunistic geographic grid. Experimental results for equipment utilization, the life cycle of the crowd network, user satisfaction, and control complexity show that the proposed scheme is better suited to embedded network geographic information systems.

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

27

Profit-oriented task scheduling algorithm in Hadoop cluster (2016)

Xu-qing Chai, Yong-liang Dong and Jun-fei Li

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-03-31

Description: Nowadays, many enterprises provide cloud services based on their own Hadoop clusters. Because the resources of a Hadoop cluster are limited, the Hadoop cluster must select some specific tasks to allocate limit...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

28

Music detection from broadcast contents using convolutional neural networks with a Mel-scale kernel (2019)

Byeong-Yong Jang, Woon-Haeng Heo, Jung-Hyun Kim and Oh-Wook Kwon

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2019

Description: We propose a new method for music detection from broadcasting contents using the convolutional neural networks with a Mel-scale kernel. In this detection task, music segments should be annotated from the broad...

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

29

A new joint CTC-attention-based speech recognition model with multi-level multi-head attention (2019)

Chu-Xiong Qin, Wen-Lin Zhang and Dan Qu

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2019

Description: A method called joint connectionist temporal classification (CTC)-attention-based speech recognition has recently received increasing focus and has achieved impressive performance. A hybrid end-to-end architec...

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

30

ALBAYZIN 2018 spoken term detection evaluation: a multi-domain international evaluation in Spanish (2019)

Javier Tejedor, Doroteo T. Toledano, Paula Lopez-Otero, Laura Docio-Fernandez, Ana R. Montalvo, Jose M. Ramirez, Mikel Peñagarikano and Luis Javier Rodriguez-Fuentes

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2019

Description: Search on speech (SoS) is a challenging area due to the huge amount of information stored in audio and video repositories. Spoken term detection (STD) is an SoS-related task aiming to retrieve data from a spee...

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

31

Articulation constrained learning with application to speech emotion recognition (2019)

Mohit Shah, Ming Tu, Visar Berisha, Chaitali Chakrabarti and Andreas Spanias

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2019

Description: Speech emotion recognition methods combining articulatory information with acoustic features have been previously shown to improve recognition performance. Collection of articulatory data on a large scale may ...

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

32

Erratum to: Efficient voice activity detection algorithm using long-term spectral flatness measure (2015)

Yanna Ma; Akinori Nishihara

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-10-21

Description: No description available

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

33

Semi-fragile digital speech watermarking for online speaker recognition (2015)

Mohammad Nematollahi; Mohammad Akhaee; S. Al-Haddad; Hamurabi Gamboa-Rosales

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-10-22

Description: In this paper, a semi-fragile and blind digital speech watermarking technique for online speaker recognition systems is proposed. It is based on the discrete wavelet packet transform (DWPT) and quantization index modulation (QIM), and it embeds the watermark within an angle of the wavelet’s sub-bands. To minimize the degradation effects of the watermark, these sub-bands were selected from frequency ranges where little speaker-specific information was available (500–3500 Hz and 6000–7000 Hz). Experimental results on the TIMIT, MIT, and MOBIO speech databases show that the degradation results for speaker verification and identification are 0.39 and 0.97 %, respectively, which are negligible. In addition, the proposed watermark technique can provide the appropriate fragility required for different signal processing operations.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

34

Physical task stress and speaker variability in voice quality (2015)

Keith Godin; John Hansen

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-10-22

Description: The presence of physical task stress induces changes in the speech production system, which in turn produce changes in speaking behavior. This results in measurable acoustic correlates, including changes to formant center frequencies, breath pause placement, and fundamental frequency. Many of these changes are due to the subject’s internal competition between speaking and breathing during the performance of the physical task, which has a corresponding impact on muscle control and airflow within the glottal excitation structure as well as the vocal tract articulatory structure. This study considers the effect of physical task stress on voice quality. Three signal processing-based values, (i) the normalized amplitude quotient (NAQ), (ii) the harmonic richness factor (HRF), and (iii) the fundamental frequency, are used to measure voice quality. The effects of physical stress on voice quality depend on the speaker as well as the specific task. While some speakers do not exhibit changes in voice quality, a subset exhibits changes in NAQ and HRF measures of similar magnitude to those observed in studies of soft, loud, and pressed speech. For those speakers demonstrating voice quality changes, the observed changes tend toward breathy or soft voicing as observed in other studies. The effect of physical stress on the fundamental frequency is correlated with the effect of physical stress on the HRF (r = −0.34) and the NAQ (r = −0.53). Also, the inter-speaker variation in baseline NAQ is significantly higher than the variation in NAQ induced by physical task stress. The results illustrate systematic changes in speech production under physical task stress, which in theory will impact subsequent speech technology such as speech recognition, speaker recognition, and voice diarization systems.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

35

An acoustic data transmission system based on audio data hiding: method and performance evaluation (2015)

Kiho Cho; Jae Choi; Nam Kim

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-07-17

Description: Acoustic data transmission (ADT) forms a branch of audio data hiding techniques, with the capability of communicating data in short-range aerial space between a loudspeaker and a microphone. In this paper, we propose an acoustic data transmission system extending our previous studies and give an in-depth analysis of its performance. The proposed technique utilizes the phases of modulated complex lapped transform (MCLT) coefficients of the audio signal. To achieve a good trade-off between audio quality and data transmission performance, an enhanced segmental SNR adjustment (SSA) algorithm is proposed. We also propose a scheme that uses multiple microphones for the ADT technique. This multi-microphone ADT technique further enhances the transmission performance while ensuring compatibility with the single-microphone system. A series of experiments shows that the transmission performance improves as the MCLT frame gets longer, at the cost of audio quality degradation. In addition, a good trade-off between audio quality and data transmission performance is achieved by means of the SSA algorithm. The experimental results also reveal that the proposed multi-microphone method is useful in enhancing the transmission performance.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

36

Comparison of smart grid architectures for monitoring and analyzing power grid data via Modbus and REST (2016)

Susanne Kenner, Raphael Thaler, Markus Kucera, Klaus Volbert and Thomas Waas

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-08-13

Description: Smart grid, smart metering, electromobility, and the regulation of the power network are keywords of the transition in energy politics. In the future, the power grid will be smart. Based on different works, th...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

37

A hybrid input-type recurrent neural network for LVCSR language modeling (2016)

Vataya Chunwijitra, Ananlada Chotimongkol and Chai Wutiwiwatchai

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-08-10

Description: Substantial amounts of resources are usually required to robustly develop a language model for an open vocabulary speech recognition system as out-of-vocabulary (OOV) words can hurt recognition accuracy. In th...

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

38

Embedded solutions for a class of highly unstable, underactuated and self-balancing robotic systems (2016)

Andrea Bonci, Massimiliano Pirani and Sauro Longhi

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-08-11

Description: This paper presents a didactic framework in embedded electronics systems that is used to elicit awareness into students and engineers on the design issues arising in the realization of a class of underactuated...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

39

Embedded acoustic emission system based on rock sound source crowd location (2016)

Zhang Qing

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-08-12

Description: Rock acoustic emission is often used to study the evolution of brittle materials. The cause of rock internal damage can be monitored continuously and real-timely by sensing rock acoustic wave. However, the key...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

40

FPGA implementation of JPEG encoder architectures for wireless networks (2016)

C. Scavongelli and M. Conti

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-08-12

Description: Due to its relative simplicity, the JPEG compression algorithm requires less hardware or software resources with respect to new compression algorithms, for example the JPEG2000 and the JPEG XR. This makes it s...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

41

Symbolic execution and timed automata model checking for timing analysis of Java real-time systems (2015)

Kasper Luckow; Corina Păsăreanu; Bent Thomsen

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2015-09-30

Description: This paper presents SymRT, a tool based on a combination of symbolic execution and real-time model checking for timing analysis of Java systems. Symbolic execution is used to generate a safe and tight timing model of the analyzed system that captures the feasible execution paths. The model is combined with suitable execution environment models capturing the timing behavior of the target host platform, including the Java virtual machine and complex hardware features such as caching. The complete timing model is a network of timed automata, which directly allows safe estimates of worst- and best-case execution time to be determined using the Uppaal model checker. Furthermore, the integration of the proposed techniques into the TetaSARTS tool facilitates reasoning about additional timing properties such as the schedulability of periodically and sporadically released Java real-time tasks (under specific scheduling policies), worst-case response time, and more.

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

42

Dynamic power management for reactive stream processing on the SCC tiled architecture (2016)

Nilesh Karavadara, Michael Zolda, Vu Thien Nga Nguyen, Jens Knoop and Raimund Kirner

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-06-15

Description: Dynamic voltage and frequency scaling (DVFS) is a means to adjust the computing capacity and power consumption of computing systems to the application demands. DVFS is generally useful to...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

43

Big data services drive mobile crowd embedded opportunistic control mechanism for biological engineering (2016)

Hai-Chao Wang and Zeng Dong

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-06-03

Description: Big data of biological engineering and mobile control increase the complexity of system control. In order to resolve the above problems and improve biological engineering system performance, this paper propose...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

44

Research on the fusion mechanism of cooperative embedded filtering and crowd content recommendation (2016)

Chen Yu-yun

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-09-01

Description: Internet simultaneous services of large-scale users will lead to server overload and information failure. Static content recommendation system cannot adapt to the dynamic similarity characteristics of users. S...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

45

A design methodology for soft-core platforms on FPGA with SMP Linux, OpenMP support, and distributed hardware profiling system (2016)

Vittoriano Muttillo, Giacomo Valente, Fabio Federici, Luigi Pomante, Marco Faccio, Carlo Tieri and Serenella Ferri

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-09-17

Description: In recent years, the use of multiprocessor systems has become increasingly common. Even in the embedded domain, the development of platforms based on multiprocessor systems or the porting of legacy single-core...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

46

Mechanical hydraulic characteristic analysis scheme based on lightweight crowd data in mobile embedded devices (2016)

Shu-yi Guo and Qi Si

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-08-23

Description: In order to improve the efficiency of mechanical and hydraulic control of the mechanical equipment, the analysis scheme of mechanical hydraulic characteristics based on lightweight crowd data was proposed in m...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical Engineering, Electronics, Communications Engineering, Computer Science

Published by Springer

47

ViSQOL: an objective speech quality model (2015)

Andrew Hines; Jan Skoglund; Anil Kokaram; Naomi Harte

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-05-22

Description: This paper presents an objective speech quality model, ViSQOL, the Virtual Speech Quality Objective Listener. It is a signal-based, full-reference, intrusive metric that models human speech quality perception using a spectro-temporal measure of similarity between a reference and a test speech signal. The metric has been specifically designed to be robust to quality issues associated with Voice over IP (VoIP) transmission. This paper describes the algorithm and compares its quality predictions with the ITU-T standard metrics PESQ and POLQA for common problems in VoIP: clock drift, associated time warping, and playout delays. The results indicate that ViSQOL and POLQA significantly outperform PESQ, with ViSQOL competing well with POLQA. An extensive benchmarking against PESQ, POLQA, and simpler distance metrics using three speech corpora (NOIZEUS, E4, and the ITU-T P.Sup. 23 database) is also presented. These experiments benchmark the performance for a wide range of quality impairments, including VoIP degradations, a variety of background noise types, speech enhancement methods, and SNR levels. The results and subsequent analysis show that both ViSQOL and POLQA have some performance weaknesses and under-predict perceived quality in certain VoIP conditions. Both have a wider application and robustness to conditions than PESQ or more trivial distance metrics. ViSQOL is shown to offer a useful alternative to POLQA in predicting speech quality in VoIP scenarios.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

48

An improved i-vector extraction algorithm for speaker verification (2015)

Wei Li; Tianfan Fu; Jie Zhu

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-06-28

Description: Over recent years, the i-vector-based framework has been proven to provide state-of-the-art performance in speaker verification. Each utterance is projected onto a total factor space and is represented by a low-dimensional feature vector. Channel compensation techniques are carried out in this low-dimensional feature space. Most of the compensation techniques take the sets of extracted i-vectors as input. By constructing between-class covariance and within-class covariance, we attempt to minimize the between-class variance mainly caused by channel effect and to maximize the variance between speakers. In real-world applications, enrollment and test data from each user (or speaker) are always scarce. Although it is widely thought that session variability is mostly caused by channel effects, phonetic variability, as a factor that causes session variability, is still a matter to be considered. We propose in this paper a new i-vector extraction algorithm from the total factor matrix, which we term component reduction analysis (CRA). This new algorithm contributes to better modelling of session variability in the total factor space. We report results on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation (SREs) dataset. As measured both by equal error rate and the minimum values of the NIST detection cost function, 10–15 % relative improvement is achieved compared to the baseline of a traditional i-vector-based system.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

49

Singer identification using perceptual features and cepstral coefficients of an audio signal from Indian video songs (2015)

Tushar Ratanpara; Narendra Patel

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-06-26

Description: Singer identification is a difficult topic in music information retrieval because background instrumental music is mixed with the singing voice, which reduces system performance. One of the main disadvantages of existing systems is that vocals and instrumentals are separated manually and only the vocals are used to build the training model. The research presented in this paper automatically recognizes a singer without separating instrumental and singing sounds, using audio features such as timbre coefficients, pitch class, mel frequency cepstral coefficients (MFCC), linear predictive coding (LPC) coefficients, and loudness of an audio signal from Indian video songs (IVS). Initially, various IVS of distinct playback singers (PS) are collected. After that, 53 audio features (12-dimensional timbre audio feature vectors, 12 pitch classes, 13 MFCC coefficients, 13 LPC coefficients, and a 3-dimensional loudness feature vector of an audio signal) are extracted from each segment. The dimension of the extracted audio features is reduced using principal component analysis (PCA). A playback singer model (PSM) is trained using multiclass classification algorithms such as back propagation, AdaBoost.M2, the k-nearest neighbor (KNN) algorithm, the naïve Bayes classifier (NBC), and the Gaussian mixture model (GMM). The proposed approach is tested on various combinations of datasets and different combinations of audio feature vectors with songs by various Indian male and female PS.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

50

Exploiting foreign resources for DNN-based ASR (2015)

Petr Motlicek; David Imseng; Blaise Potard; Philip Garner; Ivan Himawan

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-06-27

Description: Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it is therefore essential to explore alternatives capable of improving speech recognition results. In this paper, we investigate the relevance of foreign data characteristics, in particular domain and language, when using this data as an auxiliary data source for training ASR acoustic models based on deep neural networks (DNNs). The acoustic models are evaluated on a challenging bilingual database within the scope of the MediaParl project. Experimental results suggest that in-language (but out-of-domain) data is more beneficial than in-domain (but out-of-language) data when employed in either supervised or semi-supervised training of DNNs. The best performing ASR system, an HMM/GMM acoustic model that exploits a DNN as a discriminatively trained feature extractor, outperforms the best performing HMM/DNN hybrid by about 5 % relative (in terms of WER). The accumulated relative gain with respect to the MFCC-HMM/GMM baseline is about 30 % WER.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

51

Stereo-based histogram equalization for robust speech recognition (2015)

Randa Al-Wakeel; Mahmoud Shoman; Magdy Aboul-Ela; Sherif Abdou

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-06-10

Description: Optimal automatic speech recognition (ASR) takes place when the recognition system is tested under circumstances identical to those in which it was trained. However, in the real world, there exist many sources of mismatch between the training environment and the testing environment. These mismatches can be due to the sources of noise that exist in real environments. Speech enhancement techniques have been developed to make ASR systems robust against such noise. In this work, a method based on histogram equalization (HEQ) is proposed to compensate for the nonlinear distortions in speech representation. This approach utilizes simultaneous stereo recordings of clean speech and its corresponding noisy speech to compute a stereo Gaussian mixture model (GMM). The stereo GMM is used to compute the cumulative density function (CDF) for both clean speech and noisy speech using a sigmoid function, instead of the order statistics used in other HEQ-based methods. In the implementation, we show two choices for applying HEQ: hard decision HEQ and soft decision HEQ. The latter is based on minimum mean square error (MMSE) clean speech estimation. The experimental work shows that soft HEQ and hard HEQ achieve better recognition results than other HEQ approaches such as tabular HEQ, quantile HEQ, and polynomial fit HEQ. It also shows that soft HEQ achieves notably better recognition results than hard HEQ. The experimental results further show that using HEQ improves the efficiency of other speech enhancement techniques such as stereo piece-wise linear compensation for environment (SPLICE) and vector Taylor series (VTS), and that using HEQ in multi-style training (MST) significantly improves ASR system performance.

Print ISSN: 1687-4714

Subject: Electrical Engineering, Electronics, Communications Engineering

Published by Springer

52

Simulation of tremulous voices using a biomechanical model (2015)

Rubén Fraile; Juan Godino-Llorente; Malte Kob

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-01-20

Beschreibung: Vocal tremor has been simulated using a high-dimensional discrete vocal fold model. Specifically, respiratory, phonatory, and articulatory tremors have been modeled as instabilities in six parameters of the model. Reported results are consistent with previous knowledge in that respiratory tremor mainly causes amplitude modulation of the voice signal while laryngeal tremor causes both amplitude and frequency modulation. In turn, articulatory tremor is commonly assumed to produce only amplitude modulations but the simulation results indicate that it also produces a high-frequency modulation of the output signal. Furthermore, articulatory tremor affects the frequency response of the vocal tract and it might thus be detected by analyzing the spectral envelope of the acoustic signal.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

53

Unbekannt

Dedicated object processor for mobile augmented reality - sailor assistance case study (2015)

Jean-Philippe Diguet; Neil Bergmann; Jean-Christophe Morgère

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2015-01-23

Beschreibung: This paper addresses the design of embedded systems for outdoor augmented reality (AR) applications integrated to see-through glasses. The set of tasks includes object positioning, graphic computation, as well as wireless communications, and we consider constraints such as real-time, low power, and low footprint. We introduce an original sailor assistance application, as a typical, useful, and complex outdoor AR application, where context-dependent virtual objects must be placed in the user field of view according to head motions and ambient information. Our study demonstrates that it is worth working on power optimization, since the embedded system based on a standard general-purpose processor (GPP) + graphics processing unit (GPU) consumes more than high-luminosity see-through glasses. This work then presents three main contributions. The first is the choice and combination of position and attitude algorithms that fit the application context. The second is the architecture of the embedded system, which introduces a fast and simple object processor (OP) optimized for the domain of mobile AR. Finally, the OP implements a new pixel rendering method (incremental pixel shader (IPS)), which is implemented in hardware and takes full advantage of the OpenGL ES light model. A GP+OP(s) complete architecture is described and prototyped on a field-programmable gate array (FPGA). It includes hardware/software partitioning based on the analysis of application requirements and ergonomics.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

54

Unbekannt

A novel hybrid of genetic algorithm and ANN for developing a high efficient method for vocal fold pathology diagnosis (2015)

Vahid Majidnezhad

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-01-23

Beschreibung: In this paper, an initial feature vector based on the combination of the wavelet packet decomposition (WPD) and the Mel frequency cepstral coefficients (MFCCs) is proposed. For optimizing the initial feature vector, a genetic algorithm (GA)-based approach is proposed and compared with the well-known principal component analysis (PCA) approach. The artificial neural network (ANN) with the different learning algorithms is used as the classifier. Some experiments are carried out for evaluating and comparing the classification accuracies which are obtained by the use of the different learning algorithms and the different feature vectors (the initial and the optimized ones). Finally, a hybrid of the ANN with the 'trainscg' training algorithm and the genetic algorithm is proposed for the vocal fold pathology diagnosis. Also, the performance of the proposed method is compared with the recent works. The experiments' results show a better performance (the higher classification accuracy) of the proposed method in comparison with the others.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

55

Unbekannt

A signal subspace approach to spatio-temporal prediction for multichannel speech enhancement (2015)

Adam Borowicz

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-02-12

Beschreibung: The spatio-temporal-prediction (STP) method for multichannel speech enhancement has recently been proposed. This approach makes it theoretically possible to attenuate the residual noise without distorting speech. In addition, the STP method depends only on the second-order statistics and can be implemented using a simple linear filtering framework. Unfortunately, some numerical problems can arise when estimating the filter matrix in transients. In such a case, the speech correlation matrix is usually rank deficient, so that no solution exists. In this paper, we propose to implement the spatio-temporal-prediction method using a signal subspace approach. This allows for nullifying the noise subspace and processing only the noisy signal in the signal-plus-noise subspace. As a result, we are able to not only regularize the solution in transients but also to achieve higher attenuation of the residual noise. The experimental results also show that the signal subspace approach distorts speech less than the conventional method.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

56

Unbekannt

Noisy training for deep neural networks in speech recognition (2015)

Shi Yin; Chao Liu; Zhiyong Zhang; Yiye Lin; Dong Wang; Javier Tejedor; Thomas Zheng; Yinguo Li

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-02-12

Beschreibung: Deep neural networks (DNNs) have gained remarkable success in speech recognition, partially attributed to the flexibility of DNN models in learning complex patterns of speech signals. This flexibility, however, may lead to serious over-fitting and hence miserable performance degradation in adverse acoustic conditions such as those with high ambient noises. We propose a noisy training approach to tackle this problem: by injecting moderate noises into the training data intentionally and randomly, more generalizable DNN models can be learned. This ‘noise injection’ technique, although known to the neural computation community already, has not been studied with DNNs which involve a highly complex objective function. The experiments presented in this paper confirm that the noisy training approach works well for the DNN model and can provide substantial performance improvement for DNN-based speech recognition.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

57

Unbekannt

SIFT-based local spectrogram image descriptor: a novel feature for robust music identification (2015)

Xiu Zhang; Bilei Zhu; Linwei Li; Wei Li; Xiaoqiang Li; Wei Wang; Peizhong Lu; Wenqiang Zhang

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-02-13

Beschreibung: Music identification via audio fingerprinting has been an active research field in recent years. In the real-world environment, music queries are often deformed by various interferences which typically include signal distortions and time-frequency misalignments caused by time stretching, pitch shifting, etc. Therefore, robustness plays a crucial role in music identification technique. In this paper, we propose to use scale invariant feature transform (SIFT) local descriptors computed from a spectrogram image as sub-fingerprints for music identification. Experiments show that these sub-fingerprints exhibit strong robustness against serious time stretching and pitch shifting simultaneously. In addition, a locality sensitive hashing (LSH)-based nearest sub-fingerprint retrieval method and a matching determination mechanism are applied for robust sub-fingerprint matching, which makes the identification efficient and precise. Finally, as an auxiliary function, we demonstrate that by comparing the time-frequency locations of corresponding SIFT keypoints, the factor of time stretching and pitch shifting that music queries might have experienced can be accurately estimated.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

58

Unbekannt

Noisy training for deep neural networks in speech recognition (2015)

Shi Yin; Chao Liu; Zhiyong Zhang; Yiye Lin; Dong Wang; Javier Tejedor; Thomas Zheng; Yinguo Li

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-01-21

Beschreibung: No description available

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

59

Unbekannt

Within and cross-corpus speech emotion recognition using latent topic model-based features (2015)

Mohit Shah; Chaitali Chakrabarti; Andreas Spanias

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-01-30

Beschreibung: Owing to the suprasegmental behavior of emotional speech, turn-level features have demonstrated better success than frame-level features for recognition-related tasks. Conventionally, such features are obtained via a brute-force collection of statistics over frames, thereby losing important local information in the process which affects the performance. To overcome these limitations, a novel feature extraction approach using latent topic models (LTMs) is presented in this study. Speech is assumed to comprise a mixture of emotion-specific topics, where the latter capture emotionally salient information from the co-occurrences of frame-level acoustic features and yield better descriptors. Specifically, a supervised replicated softmax model (sRSM), based on restricted Boltzmann machines and distributed representations, is proposed to learn naturally discriminative topics. The proposed features are evaluated for the recognition of categorical or continuous emotional attributes via within and cross-corpus experiments conducted over acted and spontaneous expressions. In a within-corpus scenario, sRSM outperforms competing LTMs, while obtaining a significant improvement of 16.75% over popular statistics-based turn-level features for valence-based classification, which is considered to be a difficult task using only speech. Further analyses with respect to the turn duration show that the improvement is even more significant, 35%, on longer turns (>6 s), which is highly desirable for current turn-based practices. In a cross-corpus scenario, two novel adaptation-based approaches, instance selection and weight regularization, are proposed to reduce the inherent bias due to varying annotation procedures and cultural perceptions across databases. Experimental results indicate a natural, yet less severe, deterioration in performance - only 2.6% and 2.7%, thereby highlighting the generalization ability of the proposed features.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

60

Unbekannt

Robust design of Farrow-structure-based steerable broadband beamformers with sparse tap weights via convex optimization (2015)

Tiannan Wang; Huawei Chen

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2015-07-17

Beschreibung: The Farrow-structure-based steerable broadband beamformer (FSBB) is particularly useful in applications where the sound source of interest may move around a wide angular range. However, in contrast with the conventional filter-and-sum beamformer, the passband steerability of the FSBB is achieved at the cost of high structural complexity, i.e., a highly increased number of tap weights. Moreover, it has been shown that the FSBB is sensitive to microphone mismatches, and robust FSBB design is of interest to practical applications. To deal with the aforementioned problems, this paper studies the robust design of the FSBB with sparse tap weights via convex optimization by considering some a priori knowledge of microphone mismatches. It is shown that although the worst-case performance (WCP) optimization has been successfully applied to the design of robust filter-and-sum beamformers with bounded microphone mismatches, it may become inapplicable to robust FSBB design due to its over-conservative nature. When limited knowledge of the mean and variance of microphone mismatches is available, a robust FSBB design approach based on the worst-case mean performance optimization with the passband response variance (PRV) constraint is devised. Unlike the WCP optimization design, this approach performs well with the capability of passband stability control of array response. Finally, the robust FSBB design with sparse tap weights has been studied. It is shown that there is redundancy in the tap weights of the FSBB, i.e., robust FSBB design with sparse tap weights is viable, and thus leads to a low-complexity FSBB.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

61

Unbekannt

Comparative Study of Digital Audio Steganography Techniques (2013)

Fatiha Djebbar; Beghda Ayad

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2013-02-21

Beschreibung: The rapid spread in digital data usage in many real life applications has urged new and effective ways to ensure their security. Efficient secrecy can be achieved, at least in part, by implementing steganography techniques. Novel and versatile audio steganographic methods have been proposed. The goal of steganographic systems is to obtain a secure and robust way to conceal a high rate of secret data. We focus in this paper on digital audio steganography, which has emerged as a prominent source of data hiding across novel telecommunication technologies such as covered voice-over-IP, audio conferencing, etc. The multitude of steganographic criteria has led to a great diversity in these system design techniques. In this paper, we review current digital audio steganographic techniques and we evaluate their performance based on robustness, security and hiding capacity indicators. Another contribution of this paper is the provision of a robustness-based classification of steganographic models depending on their occurrence in the embedding process. A survey of major trends of audio steganography applications is also discussed in this paper.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

62

Unbekannt

A Space-Time Coding Scheme for RFID MIMO Systems (2012)

Feng Zheng; Thomas Kaiser

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: This paper discusses the space-time coding (STC) problem for RFID MIMO systems. First, a mathematical model for this kind of system is developed from the viewpoint of signal processing, which makes it easy to design the STC schemes. Then two STC schemes, namely Scheme I and Scheme II, are proposed. Simulation results illustrate that the proposed approaches can greatly improve the symbol-error rate (SER) performance of RFID systems, compared to the non space-time encoded RFID system. The SER performance for Scheme I and Scheme II is thoroughly compared. It is found that Scheme II with the innate real-symbol constellation yields better SER performance than Scheme I. Some design guidelines for RFID-MIMO systems are pointed out.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

63

Unbekannt

A generative-oriented model-driven design environment for customizable video surveillance systems (2012)

Nuno Cardoso; Pedro Rodrigues; João Vale; Paulo Garcia; Paulo Cardoso; João Monteiro; Jorge Cabral; José Mendes; Mongkol Ekpanyapong; Adriano Tavares

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: To tackle the growing complexity of, and huge demand for, tailored domestic video surveillance systems, along with demanding time-to-market expectations, engineers at IVV Automation, LDA are treating the video surveillance domain as families of systems that can be developed in a pay-as-you-go fashion rather than developing each new product ex nihilo. Several different new functionalities are required for each new product's hardware platforms (e.g., ranging from mobile phones and PDAs to desktop PCs) and operating systems (e.g., flavors of Linux, Windows, and Mac OS X). Some of these functionalities have special economical constraints of speed and footprint. To better accommodate all the requirements listed above, a model-driven generative software development paradigm supported by mainstream tools is proposed to offer significant leverage in hiding commonalities and configuring variabilities across families of video surveillance products while maintaining the new product quality.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

64

Unbekannt

Accurate energy characterization of OS services in embedded systems (2012)

Bassem Ouni; Cécile Belleudy; Eric Senn

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: As technology scales for increased circuit density and performance, the management of power consumption in embedded systems is becoming critical. Because the operating system (OS) is a basic component of the embedded system, the reduction and characterization of its energy consumption is a main challenge for the designers. In this work, a flow of low power OS energy characterization is introduced. The variation of the energy and power consumption of the embedded OS services is studied. The remainder of this article details the methods used to determine energy and power overheads of a set of basic services of the embedded OS: scheduling, context switch and inter-process communication. The impact of hardware and software parameters like processor frequency and scheduling policy on the energy consumption is analyzed. Also, models and laws of the power and energy are extracted. Then, to quantify the low power OS energetic overhead, the obtained models are integrated in the system level design. Our method allows estimating the energy consumption of the low power OS services when running an application on a specific hardware platform.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

65

Unbekannt

Real time simultaneous localization and mapping: towards low-cost multiprocessor embedded systems (2012)

Bastien Vincke; Abdelhafid Elouardi; Lambert Alain

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: Simultaneous Localization And Mapping (SLAM) is a technique widely used by autonomous robots operating in unknown environments. The research community has developed numerous SLAM algorithms in the last ten years. Several works have presented many algorithm optimizations. However, they have not explored a system optimization from the system hardware architecture to the algorithmic development level. New computing technologies (SIMD coprocessors, DSP, multi-cores) can greatly accelerate the system processing but require rethinking the algorithm implementation. This paper presents an efficient implementation of the EKF-SLAM algorithm on a multi-processor architecture. The algorithm-architecture adequacy aims to optimize the implementation of the SLAM algorithm on a low-cost and heterogeneous architecture (implementing an ARM processor with a SIMD coprocessor and a DSP core). Experiments were conducted with an instrumented platform. Results aim to demonstrate that an optimized implementation of the algorithm, resulting from an optimization methodology, can help to design embedded systems implementing a low-cost multiprocessor architecture operating under real time constraints.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

66

Unbekannt

A router for the containment of timing and value failures in CAN (2012)

Roland Kammerer; Roman Obermaisser; Bernhard Frömel

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: The dependability deficiencies and bandwidth constraints of the controller area network (CAN) can prevent its use in safety-relevant and performance-demanding applications. This paper introduces mechanisms for fault detection and fault isolation based on an intelligent CAN router, which exploits a priori knowledge about the permitted behavior of attached electronic control units (ECUs) in order to detect and contain failures. Experiments using an FPGA-based implementation of the CAN router evaluate these mechanisms under different failure modes (e.g., timing failures, masquerading failures). Due to its compatibility to the CAN standard, the router can improve the dependability and performance of systems with existing ECUs. In addition, we extend the application areas of CAN to systems with higher performance and dependability requirements than can be supported with a conventional bus-based network.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

67

Unbekannt

Implementation of a reconfigurable ASIP for high throughput low power DFT/DCT/FIR engine (2012)

Hanan Hassan; Karim Mohammed; Ahmed Shalash

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: In this article we present an ASIP design for a discrete Fourier transform (DFT)/discrete cosine transform (DCT)/finite impulse response (FIR) filter engine. The engine is intended for use in an accelerator-chain implementation of wireless communication systems. The engine offers a very high degree of flexibility, accepting and accelerating any-number DFT and inverse discrete Fourier transform, one- and two-dimensional DCT, and even general implementations of FIR equations. Performance approaches that of dedicated implementations of such algorithms. A customized yet flexible redundant memory map allows processor-like access while maintaining the pipeline full in a dedicated architecture-like manner. The engine is supported by a proprietary software tool that automatically sets the rounding pattern for the accelerator rounder to maintain a required signal to quantization noise or output RMS for any given algorithm. Programming of the processor is done through a mid-level language that combines register-specific instructions with DFT/DCT/FIR-specific instructions. Overall the engine allows users to program a very wide range of applications with software-like ease, while delivering performance very close to hardware. This puts the engine in an excellent spot in the current wireless communications environment with its profusion of multi-mode and emerging standards.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

68

Unbekannt

Embedded reconfigurable synchronization & acquisition ASIP for a multi-standard OFDM receiver (2012)

Mahmoud Said; Omar Nasr; Ahmed Shalash

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: Embedded reconfigurable architectures are currently attracting increasing attention in the wireless communications industry due to the escalating number of wireless standards in today's market. Application specific instruction-set processors (ASIPs) present a reconfigurable solution that offers a compromise between programmability and low power consumption. In this article, the design and implementation of an embedded synchronization and acquisition ASIP for OFDM based systems is proposed. The engine architecture is presented and the programming model is explained in detail. The proposed engine is scalable and it can be configured to support a multitude of synchronization algorithms and OFDM standards. While applicable to many OFDM systems, the proposed architecture was successfully verified on long term evolution (LTE Rel. 8) and WiMAX 802.16e systems. A partial list of synchronization and acquisition algorithms is tested on the engine for the two standards, and the results highlight the capabilities of the engine. The processor has been synthesized with a 0.18 μm standard cell CMOS library. It is estimated to occupy 1.1 mm² and the projected power consumption is 7.9 mW at 120 MHz, which meets the speed requirements of the tested standards. More results are included within the article.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

69

Unbekannt

EURASIP Journal on Embedded Systems now publishing with SpringerOpen (2012)

Zoran Salcic

Springer

In: EURASIP Journal on Embedded Systems

Publikationsdatum: 2012-12-11

Beschreibung: As the Editor-in-Chief, it is my pleasure to open this new Chapter in the development of EURASIP Journal on Embedded Systems.

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

70

Unbekannt

Estimation and quantization of ICC-dependent phase parameters for parametric stereo audio coding (2012)

Dong-il Hyun; Young-cheol Park; Dae Hee Youn

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2012-12-11

Beschreibung: Conventional parametric stereo (PS) audio coding employs inter-channel phase difference and overall phase difference as phase parameters. In this article, it is shown that those parameters cannot correctly represent the phase relationship between the stereo channels when inter-channel correlation (ICC) is less than one, which is common in practical situations. To solve this problem, we introduce new phase parameters, channel phase differences (CPDs), defined as the phase differences between the mono downmix and the stereo channels. Since CPDs have a descriptive relationship with ICC as well as inter-channel intensity difference, they are more relevant to represent the phase difference between the channels in practical situations. We also propose methods of synthesizing CPDs at the decoder. Through computer simulations and subjective listening tests, it is confirmed that the proposed methods produce significantly lower phase errors than conventional PS, and it can noticeably improve sound quality for stereo inputs with low ICCs.

Print ISSN: 1687-4714

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik

Publiziert von Springer

71

Unbekannt

Stereophonic hands-free communication system based on microphone array fixed beamforming: real-time implementation and evaluation (2012)

Matteo Pirro; Stefano Squartini; Laura Romoli; Francesco Piazza

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publikationsdatum: 2012-12-11

Beschreibung: In this paper, the authors propose an optimally designed fixed Beamformer (BF) for Stereophonic Acoustic Echo Cancellation (SAEC) in real hands-free communication applications. Several contributions related to the combination of beamforming and echo cancellation have appeared in the literature so far, but, to the best of the authors' knowledge, the idea of using optimal fixed BFs in a real-time SAEC system both for echo reduction and stereophonic audio rendering is first addressed in this contribution. The employment of such designed BFs allows positively addressing both issues, as the several simulated and real tests seem to confirm. In particular, the audio stereo-recording quality attainable through the proposed approach has been preliminarily evaluated by means of subjective listening tests. Moreover, the overall system robustness against microphone array imperfections and noise presence has been experimentally evaluated. This allowed the authors to implement a real hands-free communication system in which the proposed beamforming technique has proved its superiority with respect to the usual two-microphone one in terms of echo reduction, while guaranteeing a comparable spaciousness effect. Moreover, the proposed framework requires a low computational cost increment with regard to the baseline approach, since only a few extra filtering operations with short filters need to be executed. Nevertheless, according to the performed simulations, the BF-based SAEC configuration does not seem to require the signal decorrelation module, resulting in an overall computational saving.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

72

Comparative Study of Digital Audio Steganography Techniques (2012)

Fatiha Djebbar; Beghda Ayad

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: The rapid spread of digital data usage in many real-life applications has urged the development of new and effective ways to ensure their security. Efficient secrecy can be achieved, at least in part, by implementing steganography techniques. Novel and versatile audio steganographic methods have been proposed. The goal of steganographic systems is to obtain a secure and robust way to conceal a high rate of secret data. We focus in this paper on digital audio steganography, which has emerged as a prominent source of data hiding across novel telecommunication technologies such as covered voice-over-IP, audio conferencing, etc. The multitude of steganographic criteria has led to a great diversity in these system design techniques. In this paper, we review current digital audio steganographic techniques and evaluate their performance based on robustness, security and hiding capacity indicators. Another contribution of this paper is the provision of a robustness-based classification of steganographic models depending on their occurrence in the embedding process. A survey of major trends of audio steganography applications is also discussed in this paper.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

73

Expressed music mood classification compared with valence and arousal ratings (2012)

Albertus den Brinker; Ralph van Dinther; Janto Skowronek

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: Mood is an important aspect of music, and knowledge of mood can be used as a basic feature in music recommender and retrieval systems. A listening experiment was carried out establishing ratings for various moods and a number of attributes, e.g., valence and arousal. The analysis of these data covers the issues of the number of basic dimensions in music mood, their relation to valence and arousal, the distribution of moods in the valence-arousal plane, distinctiveness of the labels, and appropriate (number of) labels for full coverage of the plane. It is also shown that subject-averaged valence and arousal ratings can be predicted from music features by a linear model.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

74

An evolutionary feature synthesis approach for content-based audio retrieval (2012)

Toni Mäkinen; Serkan Kiranyaz; Jenni Raitoharju; Moncef Gabbouj

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: A vast number of audio features have been proposed in the literature to characterize the content of audio signals. In order to overcome specific problems related to the existing features (such as lack of discriminative power), as well as to reduce the need for manual feature selection, in this article, we propose an evolutionary feature synthesis technique with a built-in feature selection scheme. The proposed synthesis process searches for optimal linear/nonlinear operators and feature weights from a pre-defined multi-dimensional search space to generate a highly discriminative set of new (artificial) features. The evolutionary search process is based on a stochastic optimization approach in which a multi-dimensional particle swarm optimization algorithm, along with fractional global best formation and heterogeneous particle behavior techniques, is applied. Unlike many existing feature generation approaches, the dimensionality of the synthesized feature vector is also searched and optimized within a set range in order to better meet the varying requirements set by many practical applications and classifiers. The new features generated by the proposed synthesis approach are compared with typical low-level audio features in several classification and retrieval tasks. The results demonstrate a clear improvement of up to 15–20% in average retrieval performance. Moreover, the proposed synthesis technique surpasses the synthesis performance of evolutionary artificial neural networks, exhibiting a considerable capability to accurately distinguish among different audio classes.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

75

Biomimetic Multi-Resolution Analysis for Robust Speaker Recognition (2012)

Sridhar Nemala; Dmitry Zotkin; Ramani Duraiswami; Mounya Elhilali

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: Humans exhibit a remarkable ability to reliably classify sound sources in the environment even in the presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance when speech signals are corrupted with channel or background distortions. Our brains are equipped with elaborate machinery for speech analysis and feature extraction, understanding of which would presumably improve the performance of automatic speech processing systems under adverse conditions. The work presented here explores a biologically-motivated multi-resolution speaker information representation obtained by performing an intricate yet computationally-efficient analysis of the information-rich spectro-temporal attributes of the speech signal. We evaluate the proposed features in a speaker verification task performed on NIST SRE 2010 data. The biomimetic approach yields significant robustness in the presence of non-stationary noise and reverberation, offering a new framework for deriving reliable features for speaker recognition and speech processing.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

76

Pseudorandom recursions II (2012)

Laszlo Hars; Gyorgy Petruska

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2012-12-11

Description: We present our earlier results (not included in Hars and Petruska due to space and time limitations), as well as some updated versions of those, and a few more recent pseudorandom number generator designs. These tell a systems designer which computer word lengths are suitable for certain high-quality pseudorandom number generators, and which constructions of a large family of designs provide long cycles, the most important property of such generators. The employed mathematical tools could help assess the bit-mixing and mapping properties of a large class of iterated functions that perform only non-multiplicative computer operations: SHIFT, ROTATE, ADD, and XOR.

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

77

A framework for modeling and simulating energy harvesting WSN nodes with efficient power management policies (2012)

Andrea Castagnetti; Alain Pegatoquet; Cécile Belleudy; Michel Auguin

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2012-12-11

Description: Wireless Sensor Networks (WSNs) require an extremely energy-efficient design. As sensor nodes carry limited power sources, the problem of autonomy is crucial. Energy harvesting provides a potential solution to this problem. However, as current energy harvesters produce only a small amount of energy and the storage capacity is limited, efficient power management techniques must also be considered. In this article, we address the problem of modeling and simulating energy harvesting WSN nodes with efficient power management policies. To that end, we propose a framework for describing and simulating an energy harvesting sensor node. A high-level modeling approach based on the power consumption and the energy harvesting is proposed. The node architectural parameters as well as the on-line power management techniques can also be specified. Two novel power management architectures are then introduced, taking into account energy-neutral and negative-energy conditions. Simulation results show that they can improve the throughput of a sensor node by about 50% compared to a state-of-the-art power management algorithm for solar harvesting WSNs. The simulation framework is then used to find an efficient system sizing for a solar energy harvesting WSN node.

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

78

A parameterizable spatiotemporal representation of popular dance styles for humanoid dancing characters (2012)

João Oliveira; Luiz Naveda; Fabien Gouyon; Luis Reis; Paulo Sousa; Marc Leman

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: Dance movements are a complex class of human behavior which convey forms of non-verbal and subjective communication that are performed as cultural vocabularies in all human cultures. The singularity of dance forms imposes fascinating challenges to computer animation and robotics, which in turn presents outstanding opportunities to deepen our understanding of the phenomenon of dance by means of developing models, analyses and syntheses of motion patterns. In this article, we formalize a model for the analysis and representation of popular dance styles of repetitive gestures by specifying the parameters and validation procedures necessary to describe the spatiotemporal elements of the dance movement in relation to its music temporal structure (musical meter). Our representation model is able to precisely describe the structure of dance gestures according to the structure of musical meter, at different temporal resolutions, and is flexible enough to convey the variability of the spatiotemporal relation between music structure and movement in space. It results in a compact and discrete mid-level representation of the dance that can be further applied to algorithms for the generation of movements in different humanoid dancing characters. The validation of our representation model relies upon two hypotheses: (i) the impact of metric resolution and (ii) the impact of variability towards fully and naturally representing a particular dance style of repetitive gestures. We numerically and subjectively assess these hypotheses by analyzing solo dance sequences of Afro-Brazilian samba and American Charleston, captured with a MoCap (Motion Capture) system. From these analyses, we build a set of dance representations modeled with different parameters, and re-synthesize motion sequence variations of the represented dance styles. To assess the metric hypothesis specifically, we compare the captured dance sequences with repetitive sequences of a fixed dance motion pattern, synthesized at different metric resolutions for both dance styles. In order to evaluate the hypothesis of variability, we compare the same repetitive sequences with others synthesized with variability, by generating and concatenating stochastic variations of the represented dance pattern. The observed results validate the proposition that different dance styles of repetitive gestures might require a minimum and sufficient metric resolution to be fully represented by the proposed representation model. Yet, these also suggest that additional information may be required to synthesize variability in the dance sequences while assuring the naturalness of the performance. Nevertheless, we found evidence that supports the use of the proposed dance representation for flexibly modeling and synthesizing dance sequences from different popular dance styles, with potential developments for the generation of expressive and natural movement profiles onto humanoid dancing characters.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

79

Speaker diarization of broadcast news in Albayzin 2010 evaluation campaign (2012)

Martin Zelenák; Henrik Schulz; Javier Hernando

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: In this article, we present the evaluation results for the task of speaker diarization of broadcast news, which was part of the Albayzin 2010 evaluation campaign of language and speech technologies. The evaluation data consists of a subset of the Catalan broadcast news database recorded from the 3/24 TV channel. Five submitted systems from five different research labs are described, highlighting their common as well as their distinctive features. The diarization performance is analyzed in the context of the diarization error rate, the number of detected speakers and also the acoustic background conditions. An effort is also made to relate the achieved results to the particular system design features.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

80

Speaker-Dependent Model Interpolation for Statistical Emotional Speech Synthesis (2012)

Chih-Yu Hsu; Chia-Ping Chen

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: In this paper, we propose a speaker-dependent model interpolation method for statistical emotional speech synthesis. The basic idea is to combine the neutral model set of the target speaker and an emotional model set selected from a pool of speakers. For model selection and interpolation weight determination, we propose to use a novel monophone-based Mahalanobis distance, which is a proper distance measure between two Hidden Markov Model sets. We design a Latin-square evaluation to reduce the systematic bias in the subjective listening tests. The proposed interpolation method achieves sound performance on the emotional expressiveness, the naturalness, and the target speaker similarity. Moreover, such performance is achieved without the need to collect the emotional speech of the target speaker, saving the cost of data collection and labeling.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

81

Speech steganography using wavelet and Fourier transforms (2012)

Siwar Rekik; Driss Guerchi; Sid-Ahmed Selouani; Habib Hamam

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2012-12-11

Description: A new method to secure speech communication using the discrete wavelet transform (DWT) and the fast Fourier transform is presented in this article. In the first phase of the hiding technique, we separate the speech high-frequency components from the low-frequency components using the DWT. In a second phase, we exploit the low-pass spectral properties of the speech spectrum to hide another secret speech signal in the low-amplitude high-frequency regions of the cover speech signal. The proposed method allows hiding a large amount of secret information while making steganalysis more complex. Experimental results demonstrate the efficiency of the proposed hiding technique: the stego signals are perceptually indistinguishable from the equivalent cover signal, and the secret speech message can be recovered with only slight degradation in quality.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

82

A comprehensive system for facial animation of generic 3D head models driven by speech (2013)

Lucas Terissi; Mauricio Cerda; Juan Gómez; Nancy Hitschfeld-Kahler; Bernard Girau

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2013-02-02

Description: A comprehensive system for facial animation of generic 3D head models driven by speech is presented in this article. In the training stage, audio-visual information is extracted from audio-visual training data, and then used to compute the parameters of a single joint audio-visual hidden Markov model (AV-HMM). In contrast to most of the methods in the literature, the proposed approach does not require segmentation/classification processing stages of the audio-visual data, avoiding the error propagation related to these procedures. The trained AV-HMM provides a compact representation of the audio-visual data, without the need of phoneme (word) segmentation, which makes it adaptable to different languages. Visual features are estimated from the speech signal based on the inversion of the AV-HMM. The estimated visual speech features are used to animate a simple face model. The animation of a more complex head model is then obtained by automatically mapping the deformation of the simple model to it, using a small number of control points for the interpolation. The proposed algorithm allows the animation of 3D head models of arbitrary complexity through a simple setup procedure. The resulting animation is evaluated in terms of intelligibility of visual speech through perceptual tests, showing a promising performance. The computational complexity of the proposed system is analyzed, showing the feasibility of its real-time implementation.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

83

Speech enhancement based on Bayesian decision and spectral amplitude estimation (2015)

Feng Deng; Chang-Chun Bao

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-10-08

Description: In this paper, a single-channel speech enhancement method based on Bayesian decision and spectral amplitude estimation is proposed, in which a speech detection module and a spectral amplitude estimation module are included, and the two modules are strongly coupled. First, under the decisions of speech presence and speech absence, the optimal speech amplitude estimators are obtained by minimizing a combined Bayesian risk function, respectively. Second, using the obtained spectral amplitude estimators, the optimal speech detector is achieved by further minimizing the combined Bayesian risk function. Finally, according to the detection results of the speech detector, the optimal decision rule is made and the optimal spectral amplitude estimator is chosen for enhancing noisy speech. Furthermore, by considering both detection and estimation errors, we propose a combined cost function which incorporates two general weighted distortion measures for the speech presence and speech absence of the spectral amplitudes, respectively. The cost parameters in the cost function are employed to balance the speech distortion and residual noise caused by missed detection and false alarm, respectively. In addition, we propose two adaptive calculation methods for the perceptual weighted order p and the spectral amplitude order β used in the proposed cost function. The objective and subjective test results indicate that the proposed method can achieve a more significant segmental signal-to-noise ratio (SNR) improvement, a lower log-spectral distortion, and a better speech quality than the reference methods.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

84

Detecting fingering of overblown flute sound using sparse feature learning (2016)

Yoonchang Han; Kyogu Lee

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-01-22

Description: In woodwind instruments such as a flute, producing a higher-pitched tone than a standard tone by increasing the blowing pressure is called overblowing, and this allows several distinct fingerings for the same notes. This article presents a method that attempts to learn acoustic features that are more appropriate than conventional features such as mel-frequency cepstral coefficients (MFCCs) in detecting the fingering from a flute sound using unsupervised feature learning. To do so, we first extract a spectrogram from the audio and convert it to a mel scale. Then, we concatenate four consecutive mel-spectrogram frames to include short temporal information and use it as a front end for the sparse filtering algorithm. The learned feature is then max-pooled, resulting in a final feature vector for the classifier that has extra robustness. We demonstrate the advantages of the proposed method in a twofold manner: we first visualize and analyze the differences in the learned features between the tones generated by standard and overblown fingerings. We then perform a quantitative evaluation through classification tasks on six selected pitches with up to five different fingerings that include a variety of octave-related and non-octave-related fingerings. The results confirm that the learned features using the proposed method significantly outperform the conventional MFCCs and the residual noise spectrum in every experimental condition for the classification tasks.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

85

Grid-based approximation for voice conversion in low resource environments (2016)

Hadas Benisty; David Malah; Koby Crammer

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-01-23

Description: The goal of voice conversion is to modify a source speaker’s speech to sound as if spoken by a target speaker. Common conversion methods are based on Gaussian mixture modeling (GMM). They aim to statistically model the spectral structure of the source and target signals and require relatively large training sets (typically dozens of sentences) to avoid over-fitting. Moreover, they often lead to muffled synthesized output signals, due to excessive smoothing of the spectral envelopes. Mobile applications are characterized by low resources in terms of training data, memory footprint, and computational complexity. As technology advances, computational and memory requirements become less limiting; however, the amount of available training data still presents a great challenge, as a typical mobile user is willing to record himself saying just a few sentences. In this paper, we propose the grid-based (GB) conversion method for such low resource environments, which is successfully trained using very few sentences (5–10). The GB approach is based on sequential Bayesian tracking, by which the conversion process is expressed as a sequential estimation problem of tracking the target spectrum based on the observed source spectrum. The converted Mel frequency cepstrum coefficient (MFCC) vectors are sequentially evaluated using a weighted sum of the target training vectors used as grid points. The training process includes simple computations of Euclidean distances between the training vectors and is easily performed even in cases of very small training sets. We use global variance (GV) enhancement to improve the perceived quality of the synthesized signals obtained by the proposed and the GMM-based methods. Using just 10 training sentences, our enhanced GB method leads to converted sentences having closer GV values to those of the target and to lower spectral distances at the same time, compared to the enhanced version of the GMM-based conversion method. Furthermore, subjective evaluations show that signals produced by the enhanced GB method are perceived as more similar to the target speaker than the enhanced GMM signals, at the expense of a small degradation in the perceived quality.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

86

Grid-based approximation for voice conversion in low resource environments (2016)

Hadas Benisty, David Malah and Koby Crammer

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-02-24

Description: The goal of voice conversion is to modify a source speaker’s speech to sound as if spoken by a target speaker. Common conversion methods are based on Gaussian mixture modeling (GMM). They aim to statistically ...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

87

Embedded mobile crowd service systems based on opportunistic geological grid and dynamical segmentation (2016)

Chunhua Dong, Li Wang and Kunming Zhao

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-02-24

Description: In order to solve problems such as the demand for geographic information services, the short life of the embedded system, and network collapse, the embedded mobile crowd service system...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

88

Developing a unit selection voice given audio without corresponding text (2016)

Tejas Godambe, Sai Krishna Rallabandi, Suryakanth V. Gangashetty, Ashraf Alkhairy and Afshan Jafri

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-03-02

Description: Today, a large amount of audio data is available on the web in the form of audiobooks, podcasts, video lectures, video blogs, news bulletins, etc. In addition, we can effortlessly record and store audio data s...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

89

Cloud-assisted QoE guarantee scheme based on adaptive cross-layer perceptron of artificial neural network for mobile Internet (2016)

Zhou Silin

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-02-24

Description: To improve the system performance of the mobile Internet, providing a Quality of Experience (QoE) guarantee is an important factor. First, based on artificial neural network and adaptive cross-layer perc...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

90

Comparison of ALBAYZIN query-by-example spoken term detection 2012 and 2014 evaluations (2016)

Javier Tejedor, Doroteo T. Toledano, Paula Lopez-Otero, Laura Docio-Fernandez and Carmen Garcia-Mateo

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-02-24

Description: Query-by-example spoken term detection (QbE STD) aims at retrieving data from a speech repository given an acoustic query containing the term of interest as input. Nowadays, it is receiving much interest due t...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

91

Dedicated object processor for mobile augmented reality - sailor assistance case study (2016)

Jean-Philippe Diguet, Neil Bergmann and Jean-Christophe Morgère

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-02-24

Description: This paper addresses the design of embedded systems for outdoor augmented reality (AR) applications integrated into see-through glasses. The set of tasks includes object positioning, graphic computation, as well...

Print ISSN: 1687-3955

Digital ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

92

Mobile service aware opportunistic embedded architecture of mobile crowd sensing networks for power network measurement (2016)

Jianwei Zhang and Hao Yang

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-02-24

Description: In order to improve the degree of intelligence and the robustness optimization of the power grid management system, an opportunistic embedded architecture was proposed for power network measurement with mobile service aw...

Print ISSN: 1687-3955

Digitale ISSN: 1687-3963

Thema: Elektrotechnik, Elektronik, Nachrichtentechnik , Informatik

Publiziert von Springer

93

Unknown

Electronic commerce recommendation mobile crowd system based on cooperative data collection and embedded control (2016)

Juan Wang, Wen-Min Deng and Xing-Yue Yin

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-02-25

Description: It is known that the collection of the specific needs of mobile users and location management in an electronic commerce recommendation system are important indicators used to evaluate user satisfaction and sys...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

94

Unknown

Symbolic execution and timed automata model checking for timing analysis of Java real-time systems (2016)

Kasper S. Luckow, Corina S. Păsăreanu and Bent Thomsen

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-02-24

Description: This paper presents SymRT, a tool based on a combination of symbolic execution and real-time model checking for timing analysis of Java systems. Symbolic execution is used for the generation of a safe and tight t...

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

95

Unknown

iSargam: music notation representation for Indian Carnatic music (2016)

Stanly Mammen, Ilango Krishnamurthi, A. Jalaja Varma and G. Sujatha

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-02-24

Description: Indian classical music, including its two varieties, Carnatic and Hindustani music, has a rich music tradition and enjoys a wide audience from various parts of the world. The Carnatic music which is more popul...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

96

Unknown

Hybrid statistical/unit-selection Turkish speech synthesis using suffix units (2016)

Cenk Demiroğlu and Ekrem Güner

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-02-24

Description: Unit selection based text-to-speech synthesis (TTS) has been the dominant TTS approach of the last decade. Despite its success, unit selection approach has its disadvantages. One of the most significant disadv...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

97

Unknown

Detecting fingering of overblown flute sound using sparse feature learning (2016)

Yoonchang Han and Kyogu Lee

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-02-24

Description: In woodwind instruments such as a flute, producing a higher-pitched tone than a standard tone by increasing the blowing pressure is called overblowing, and this allows several distinct fingerings for the same ...

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

98

Unknown

Hybrid statistical/unit-selection Turkish speech synthesis using suffix units (2016)

Cenk Demiroğlu; Ekrem Güner

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2016-02-03

Description: Unit selection based text-to-speech synthesis (TTS) has been the dominant TTS approach of the last decade. Despite its success, the unit selection approach has its disadvantages. One of the most significant disadvantages is the sudden discontinuities in speech that distract the listeners (Speech Commun 51:1039–1064, 2009). The second disadvantage is that significant expertise and large amounts of data are needed for building a high-quality synthesis system, which is costly and time-consuming. The statistical speech synthesis (SSS) approach is a promising alternative synthesis technique. Not only are the spurious errors observed in the unit selection system mostly absent in SSS, but building voice models is also far less expensive and faster than with the unit selection system. However, the resulting speech is typically not as natural-sounding as speech that is synthesized with a high-quality unit selection system. There are hybrid methods that attempt to take advantage of both SSS and unit selection systems. However, existing hybrid methods still require development of a high-quality unit selection system. Here, we propose a novel hybrid statistical/unit selection system for Turkish that aims at improving the quality of the baseline SSS system by improving the prosodic parameters such as intonation and stress. Commonly occurring suffixes in Turkish are stored in the unit selection database and used in the proposed system. As opposed to existing hybrid systems, the proposed system was developed without building a complete unit selection synthesis system. Therefore, the proposed method can be used without collecting large amounts of data or utilizing the substantial expertise or time-consuming tuning that is typically required in building unit selection systems. Listeners preferred the hybrid system over the baseline system in the AB preference tests.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer

99

Unknown

Mobile service aware opportunistic embedded architecture of mobile crowd sensing networks for power network measurement (2016)

Jianwei Zhang; Hao Yang

Springer

In: EURASIP Journal on Embedded Systems

Publication date: 2016-02-04

Description: In order to improve the degree of intelligence and the robustness optimization of the power grid management system, an opportunistic embedded architecture was proposed for power network measurement with a mobile service aware scheme. First, a mobile crowd sensing network for power grid management was proposed to realize intelligent power grid management. Then, we designed the mobile service aware opportunistic embedded system based on the requirements of intelligent power grid management and the deployment of the mobile crowd sensing network. Third, the grid of embedded systems was demonstrated for intelligent management. The experimental results show that the proposed scheme has obvious advantages in system complexity, execution efficiency, intelligent power grid management level, etc.

Print ISSN: 1687-3955

Electronic ISSN: 1687-3963

Subject: Electrical engineering, electronics, communications engineering, computer science

Published by Springer

100

Unknown

Albayzín-2014 evaluation: audio segmentation and classification in broadcast news domains (2015)

Diego Castán; David Tavarez; Paula Lopez-Otero; Javier Franco-Pedroso; Héctor Delgado; Eva Navas; Laura Docio-Fernández; Daniel Ramos; Javier Serrano; Alfonso Ortega; Eduardo Lleida

Springer

In: EURASIP Journal on Audio, Speech, and Music Processing

Publication date: 2015-12-02

Description: Audio segmentation is important as a pre-processing task to improve the performance of many speech technology tasks and, therefore, it is of undoubted research interest. This paper describes the database, the metric, the systems and the results for the Albayzín-2014 audio segmentation campaign. In contrast to previous evaluations where the task was the segmentation of non-overlapping classes, the Albayzín-2014 evaluation proposes the delimitation of the presence of speech, music and/or noise that can be found simultaneously. The database used in the evaluation was created by fusing different media and noises in order to increase the difficulty of the task. Seven segmentation systems from four different research groups were evaluated and combined. Their experimental results were analyzed and compared with the aim of providing a benchmark and highlighting the promising directions in this field.

Print ISSN: 1687-4714

Subject: Electrical engineering, electronics, communications engineering

Published by Springer
