# Jean Hennebert

Personal page of Jean
Function/Title
Prof. Dr. in Machine Learning
Dir. iCoSys
Coord. MSE Data Science
Contact
+41 26 429 65 96
Main Skills
Expertise:
• AI
• Machine Learning
• Deep Learning
Human Languages:
• French
• English
• German
Computer Languages:
• Python
• Java
Technologies:
• scikit-learn
• TensorFlow
• PyTorch
Dr. Jean Hennebert is a full professor in Machine Learning at the University of Applied Sciences of Western Switzerland (HES-SO HEIA-FR). He leads the Master Programme MSE in Data Science at HES-SO and, as profile coordinator, at the national level. He is also director of the Institute of Complex Systems, where he manages research activities in domains related to applied Machine Learning and complex Information Systems. He teaches computer science classes on machine learning, deep learning, software development and IT project management. He also holds an appointment with the Department of Informatics of the University of Fribourg as a lecturer and PhD supervisor.

Dr. Jean Hennebert has been working for over 25 years in fundamental and applied research at the intersection of machine learning and business needs. He has initiated more than 25 EU and national research projects in the field of machine learning, covering application domains such as biomedical applications, biometrics, text processing and multi-dimensional signal processing. He is the author or co-author of more than 100 publications and 5 patents in the field of machine learning (h-index > 30).

Education
• Ph.D. in Computer Science
1994 / 1998
EPFL – Ecole Polytechnique Fédérale de Lausanne
• MSc. Engineering in Electricity and Telecommunication
1988 / 1993
FPMS Polytechnique Mons, Belgium
Professional Experience
• Full Professor – University of Applied Sciences HES-SO
2007 / now
From 2007 to 2011 with HES-SO//Wallis and since 2011 with HES-SO//Fribourg.
• Lecturer
2004 / now
University of Fribourg
• Co-founder and CTO
2000 / 2004
UbiCall Communications, USA / Belgium
• System Architect
1998 / 2000
UBS, Zurich, Switzerland
• Visiting Researcher
1996 / 1997
International Computer Science Institute, Berkeley CA, USA
• Researcher and Ph.D. student
1993 / 1998
EPFL, Lausanne
Teaching
• Deep Learning – HES-SO and ZHAW MSE, Master level
2018-now
• Machine Learning – HES-SO MSE, Master level
2016-now
• Machine Learning – HES-SO HEIA-FR, Bachelor level
2013-now
• IT Startup Bootcamp – HES-SO HEIA-FR, Bachelor level
2018-now
• Scientific Programming – UNIFR, Bachelor level
2018-now
• Algorithms and Data Structures – HES-SO HEIA-FR, Bachelor level
2011-2016
• IT Project Management – HES-SO HEIA-FR and MSE, Bachelor and Master level
2008-2020
Supervised PhD Theses
• Frédéric Montet – Machine Learning For Smart Building
running
• Oussama Zayene – Detection and Recognition of Artificial Text in Arabic News Videos
2018
• Baptiste Wicht – Deep Learning for Feature Extraction in Images
2018
• Christophe Gisler – Generic Data-Driven Approaches to Time Series Classification
2017
• Kai Chen – Machine Learning for Automatic Text Segmentation in Historical Documents
2017
• Antonio Ridi – Generative Models for Time Series in the Context of In-Home Monitoring
2016
• Gérôme Bovet – A Scalable and Sustainable Web of Buildings Architecture
2016
• Nayla Sokhn – Structure and Dynamics of Niche-Overlap Graphs
2015
• Fouad Slimane – Low-Resolution Printed Arabic Text Recognition with Hidden Markov Models
2013
• Florian Verdet – Exploring Variabilities through Factor Analysis in Autom. Acoustic Lang. Recog.
2011
• Andreas Humm – Modelling combined Handwriting and Speech Modalities for User Auth.
2008
Professional Activities
• TA-Swiss – Steering Committee of the Foundation for Technology Assessment
2018-now
• Innosuisse – Reviewer for the Swiss Innovation Agency
2019-now
• SISR – Vice President of the Swiss Society for Informatics – Section Romande
2018-now
Publications
• O. Zayene, R. Ingold, N. E. BenAmara, and J. Hennebert, “ICPR2020 Competition on Text Detection and Recognition in Arabic News Video Frames,” in Pattern Recognition. ICPR International Workshops and Challenges: Virtual Event, January 10-15, 2021, Proceedings, Part VIII, 2021, p. 343–356.
[Bibtex]
@inproceedings{zayene2021icpr2020,
title={ICPR2020 Competition on Text Detection and Recognition in Arabic News Video Frames},
author={Zayene, Oussama and Ingold, Rolf and BenAmara, Najoua Essoukri and Hennebert, Jean},
booktitle={Pattern Recognition. ICPR International Workshops and Challenges: Virtual Event, January 10-15, 2021, Proceedings, Part VIII},
pages={343--356},
year={2021},
organization={Springer International Publishing}
}
• L. Linder, F. Montet, J. Hennebert, and J. Bacher, “Big Building Data 2.0-a Big Data Platform for Smart Buildings,” in Journal of Physics: Conference Series, 2021, p. 12016.
[Bibtex]
@inproceedings{linder2021big,
title={Big Building Data 2.0-a Big Data Platform for Smart Buildings},
author={Linder, Lucy and Montet, Fr{\'e}d{\'e}ric and Hennebert, Jean and Bacher, Jean-Philippe},
booktitle={Journal of Physics: Conference Series},
volume={2042},
number={1},
pages={012016},
year={2021},
organization={IOP Publishing}
}
• J. Parrat, J. Bacher, F. Radu, and J. Hennebert, “Rendre visibles les pulsations de la ville,” bulletin.ch, vol. 6, p. 22–26, 2020.
[Bibtex]
@article{parrat2020rendre,
title={Rendre visibles les pulsations de la ville},
author={Parrat, Jonathan and Bacher, Jean-Philippe and Radu, Florinel and Hennebert, Jean},
journal={bulletin.ch},
volume={6},
pages={22--26},
year={2020},
publisher={Electrosuisse et l'Association des entreprises electriques suisses (AES)}
}
• A. Cholleton, A. Fischer, J. Hennebert, V. Raemy, and B. Wicht, Deep neural network generation of domain names, 2020.
[Bibtex]
@misc{cholleton2020deep,
title={Deep neural network generation of domain names},
author={Cholleton, Aubry and Fischer, Andreas and Hennebert, Jean and Raemy, Vincent and Wicht, Baptiste},
year={2020},
month=sep # "~15",
note={US Patent 10,778,640}
}
• B. Wolf, J. Donzallaz, C. Jost, A. Hayoz, S. Commend, J. Hennebert, and P. Kuonen, “Using CNNs to Optimize Numerical Simulations in Geotechnical Engineering,” in Artificial Neural Networks in Pattern Recognition, Cham, 2020, p. 247–256.
[Bibtex]
@InProceedings{10.1007/978-3-030-58309-5_20,
author="Wolf, Beat
and Donzallaz, Jonathan
and Jost, Colette
and Hayoz, Amanda
and Commend, St{\'e}phane
and Hennebert, Jean
and Kuonen, Pierre",
editor="Schilling, Frank-Peter",
title="Using CNNs to Optimize Numerical Simulations in Geotechnical Engineering",
booktitle="Artificial Neural Networks in Pattern Recognition",
year="2020",
publisher="Springer International Publishing",
pages="247--256",
abstract="Deep excavations are today mainly designed by manually optimising the wall's geometry, stiffness and strut or anchor layout. In order to better automate this process for sustained excavations, we are exploring the possibility of approximating key values using a machine learning (ML) model instead of calculating them with time-consuming numerical simulations. After demonstrating in our previous work that this approach works for simple use cases, we show in this paper that this method can be enhanced to adapt to complex real-world supported excavations. We have improved our ML model compared to our previous work by using a convolutional neural network (CNN) model, coding the excavation configuration as a set of layers of fixed height containing the soil parameters as well as the geometry of the walls and struts. The system is trained and evaluated on a set of synthetically generated situations using numerical simulation software. To validate this approach, we also compare our results to a set of 15 real-world situations in a t-SNE. Using our improved CNN model we could show that applying machine learning to predict the output of numerical simulation in the domain of geotechnical engineering not only works in simple cases but also in more complex, real-world situations.",
isbn="978-3-030-58309-5",
doi={https://doi.org/10.1007/978-3-030-58309-5_20}
}
• L. Linder, M. Jungo, J. Hennebert, C. C. Musat, and A. Fischer, “Automatic Creation of Text Corpora for Low-Resource Languages from the Internet: The Case of Swiss German,” in Proceedings of The 12th Language Resources and Evaluation Conference, Marseille, France, 2020, p. 2706–2711.
[Bibtex]
@InProceedings{linder2020crawler,
author = {Linder, Lucy and Jungo, Michael and Hennebert, Jean and Musat, Claudiu Cristian and Fischer, Andreas},
title = {Automatic Creation of Text Corpora for Low-Resource Languages from the Internet: The Case of Swiss German},
booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
month = {May},
year = {2020},
publisher = {European Language Resources Association},
pages = {2706--2711},
abstract = {This paper presents SwissCrawl, the largest Swiss German text corpus to date. Composed of more than half a million sentences, it was generated using a customized web scraping tool that could be applied to other low-resource languages as well. The approach demonstrates how freely available web pages can be used to construct comprehensive text corpora, which are of fundamental importance for natural language processing. In an experimental evaluation, we show that using the new corpus leads to significant improvements for the task of language modeling.},
url = {https://www.aclweb.org/anthology/2020.lrec-1.329}
}
• L. Rychener, F. Montet, and J. Hennebert, “Architecture Proposal for Machine Learning Based Industrial Process Monitoring,” Procedia Computer Science, vol. 170, p. 648–655, 2020.
[Bibtex]
@article{rychener2020architecture,
title={Architecture Proposal for Machine Learning Based Industrial Process Monitoring},
author={Rychener, Lorenz and Montet, Fr{\'e}d{\'e}ric and Hennebert, Jean},
journal={Procedia Computer Science},
volume={170},
pages={648--655},
year={2020},
publisher={Elsevier},
issn = {1877-0509},
doi = {https://doi.org/10.1016/j.procs.2020.03.137},
url = {http://www.sciencedirect.com/science/article/pii/S1877050920305925},
keywords = {System Architecture, Rule Engine, Anomaly Detection, Monitoring, Industry 4.0},
abstract = {In the context of Industry 4.0, an emerging trend is to increase the reliability of industrial process by using machine learning (ML) to detect anomalies of production machines. The main advantages of ML are in the ability to (1) capture non-linear phenomena, (2) adapt to many different processes without human intervention and (3) learn incrementally and improve over time. In this paper, we take the perspective of IT system architects and analyse the implications of the inclusion of ML components into a traditional anomaly detection systems. Through a prototype that we deployed for chemical reactors, our findings are that such ML components are impacting drastically the architecture of classical alarm systems. First, there is a need for long-term storage of the data that are used to train the models. Second, the training and usage of ML models can be CPU intensive and may request using specific resources. Third, there is no single algorithm that can detect machine errors. Fourth, human crafted alarm rules can now also include a learning process to improve these rules, for example by using active learning with a human-in-the-loop approach. These reasons are the motivations behind a microservice-based architecture for an alarm system in industrial machinery.}
}
• S. Commend, S. Wattel, J. Hennebert, P. Kuonen, and L. Vulliet, “Prediction of unsupported excavations behaviour with machine learning techniques,” in COMPLAS 2019, 2019, pp. 529-535.
[Bibtex]
@InProceedings{commend2019prediction,
author={St{\'{e}}phane Commend and Sacha Wattel and Jean Hennebert and Pierre Kuonen and Laurent Vulliet},
booktitle={COMPLAS 2019},
title={Prediction of unsupported excavations behaviour with machine learning techniques},
year={2019},
pages={529-535},
month={September},
}
• V. Raemy, V. Russo, J. Hennebert, and B. Wicht, “Construction of phonetic representation of a string of characters,” US Patent US9910836B2, 2018.
[Bibtex]
@patent{Raemy2018US9910836B2,
author = {Vincent Raemy and Vincenzo Russo and Jean Hennebert and Baptiste Wicht},
title = {Construction of phonetic representation of a string of characters},
year = {2018},
month = {03},
day = {06},
number = {US9910836B2},
location = {US},
url = {https://worldwide.espacenet.com/publicationDetails/biblio?CC=US&NR=9910836B2&KC=B2&FT=D},
filing_num = {14976968},
yearfiled = {2015},
monthfiled = {12},
dayfiled = {21},
abstract = {Provided are methods, devices, and computer-readable media for accessing a string of characters; parsing the string of characters into string of graphemes; determining one or more phonetic representations for one or more graphemes in the string of graphemes based on a first data structure; determining at least one grapheme representation for one or more of the one or more phonetic representations based on a second data structure; and constructing the phonetic representation of the string of characters based on the grapheme representation that was determined.}
}
• V. Raemy, V. Russo, J. Hennebert, and B. Wicht, “Method for writing a foreign language in a pseudo language phonetically resembling native language of the speaker,” US Patent US10102203B2, 2018.
[Bibtex]
@patent{Raemy2018US10102203B2,
author = {Vincent Raemy and Vincenzo Russo and Jean Hennebert and Baptiste Wicht},
title = {Method for writing a foreign language in a pseudo language phonetically resembling native language of the speaker},
year = {2018},
month = {10},
day = {16},
number = {US10102203B2},
location = {US},
url = {https://worldwide.espacenet.com/publicationDetails/biblio?CC=US&NR=10102203B2&KC=B2&FT=D},
filing_num = {14977022},
yearfiled = {2015},
monthfiled = {12},
dayfiled = {21},
abstract = {Provided is a method, device, and computer-readable medium for converting a string of characters in a first language into a phonetic representation of a second language using a first data structure that maps graphemes in the first language to one or more universal phonetic representations based on an international phonetic alphabet, wherein the first data structure comprises a plurality of first nodes with each first node of the plurality of first nodes having a respective weight assigned that corresponds to a likely pronunciation of a grapheme, and a second data structure that maps the one or more universal phonetic representations to one or more graphemes in the second language, wherein the second data structure comprises a plurality of second nodes with each second node of the plurality of second nodes having a respective weight assigned that corresponds to a likely representation of a grapheme in the second language.}
}
• V. Raemy, V. Russo, J. Hennebert, and B. Wicht, “Construction of a phonetic representation of a generated string of characters,” US Patent US10102189B2, 2018.
[Bibtex]
@patent{Raemy2018US10102189B2,
author = {Vincent Raemy and Vincenzo Russo and Jean Hennebert and Baptiste Wicht},
title = {Construction of a phonetic representation of a generated string of characters},
year = {2018},
month = {10},
day = {16},
number = {US10102189B2},
location = {US},
url = {https://worldwide.espacenet.com/publicationDetails/biblio?CC=US&NR=10102189B2&KC=B2&FT=D},
filing_num = {14977090},
yearfiled = {2015},
monthfiled = {12},
dayfiled = {21},
abstract = {Provided are methods, devices, and computer-readable media for generating a string of characters based on a set of rules; parsing the string of characters into string of graphemes; determining one or more phonetic representations for one or more graphemes in the string of graphemes based on a first data structure; determining at least one grapheme representation for one or more of the one or more phonetic representations based on a second data structure; and constructing the phonetic representation of the string of characters based on the grapheme representation that was determined.}
}
• V. Raemy, V. Russo, J. Hennebert, and B. Wicht, “Systems and methods for automatic phonetization of domain names,” US Patent US9947311B2, 2018.
[Bibtex]
@patent{Raemy2018US9947311B2,
author = {Vincent Raemy and Vincenzo Russo and Jean Hennebert and Baptiste Wicht},
title = {Systems and methods for automatic phonetization of domain names},
year = {2018},
month = {04},
day = {17},
number = {US9947311B2},
location = {US},
url = {https://worldwide.espacenet.com/publicationDetails/biblio?CC=US&NR=9947311B2&KC=B2&FT=D},
filing_num = {14977133},
yearfiled = {2015},
monthfiled = {12},
dayfiled = {21},
abstract = {A method can include receiving, from a user, a string of characters. The method can also include determining components of the string of characters. The components of the string of characters may include one or more graphemes that are related in the string of characters. The method can include determining universal phonetic representations for the components of the string of characters. The method can also include determining pronunciations for the universal phonetic representations. Additionally, the method can include constructing a pronunciation of the string of characters based at least partially on the pronunciations of the universal phonetic representations. Further, the method can include sending, to the user, a sound file representing the pronunciation of the string of characters.}
}
• L. Rychener and J. Hennebert, “Machine Learning for Anomaly Detection in Time-Series Produced by Industrial Processes,” in FTAL conference on Industrial Applied Data Science, 2018, p. 15–16.
[Bibtex]
@InProceedings{ftalconference2018lorenz,
author = {Lorenz Rychener and Jean Hennebert},
title = {Machine Learning for Anomaly Detection in Time-Series Produced by Industrial Processes},
booktitle = {FTAL conference on Industrial Applied Data Science},
pages = {15--16},
year = {2018},
month = {oct},
isbn = {978-2-8399-2549-5},
}
• L. Linder, J. Hennebert, and J. Esseiva, “BBDATA, a Big Data platform for Smart Buildings,” in FTAL conference on Industrial Applied Data Science, 2018, p. 38–39.
[Bibtex]
@InProceedings{bbdata2018ftal,
author = {Lucy Linder and Jean Hennebert and Julien Esseiva},
title = {BBDATA, a Big Data platform for Smart Buildings},
booktitle = {FTAL conference on Industrial Applied Data Science},
pages = {38--39},
year = {2018},
month = {oct},
isbn = {978-2-8399-2549-5},
}
• K. R. Martin, K. Mansouri, R. N. Weinreb, R. Wasilewicz, C. Gisler, J. Hennebert, D. Genoud, T. Shaarawy, C. Erb, N. Pfeiffer, G. E. Trope, F. A. Medeiros, Y. Barkana, J. H. K. Liu, R. Ritch, A. Mermoud, D. Jinapriya, C. Birt, I. I. Ahmed, C. Kranemann, P. Höh, B. Lachenmayr, Y. Astakhov, E. Chen, S. Duch, G. Marchini, S. Gandolfi, M. Rekas, A. Kuroyedov, A. Cernak, V. Polo, J. Belda, S. Grisanti, C. Baudouin, J. Nordmann, C. D. G. Moraes, Z. Segal, M. Lusky, H. Morori-Katz, N. Geffen, S. Kurtz, J. Liu, D. L. Budenz, O. J. Knight, J. C. Mwanza, A. Viera, F. Castanera, and J. Che-Hamzah, “Use of Machine Learning on Contact Lens Sensor–Derived Parameters for the Diagnosis of Primary Open-angle Glaucoma,” American Journal of Ophthalmology, vol. 194, pp. 46-53, 2018.
[Bibtex]
@article{keith2018lens,
title = "Use of Machine Learning on Contact Lens Sensor–Derived Parameters for the Diagnosis of Primary Open-angle Glaucoma",
journal = "American Journal of Ophthalmology",
volume = "194",
pages = "46 - 53",
year = "2018",
issn = "0002-9394",
doi = "https://doi.org/10.1016/j.ajo.2018.07.005",
url = "http://www.sciencedirect.com/science/article/pii/S0002939418303866",
author = "Keith R. Martin and Kaweh Mansouri and Robert N. Weinreb and Robert Wasilewicz and Christophe Gisler and Jean Hennebert and Dominique Genoud and Tarek Shaarawy and Carl Erb and Norbert Pfeiffer and Graham E. Trope and Felipe A. Medeiros and Yaniv Barkana and John H.K. Liu and Robert Ritch and André Mermoud and Delan Jinapriya and Catherine Birt and Iqbal I. Ahmed and Christoph Kranemann and Peter Höh and Bernhard Lachenmayr and Yuri Astakhov and Enping Chen and Susana Duch and Giorgio Marchini and Stefano Gandolfi and Marek Rekas and Alexander Kuroyedov and Andrej Cernak and Vicente Polo and José Belda and Swaantje Grisanti and Christophe Baudouin and Jean-Philippe Nordmann and Carlos G. De Moraes and Zvi Segal and Moshe Lusky and Haia Morori-Katz and Noa Geffen and Shimon Kurtz and Ji Liu and Donald L. Budenz and O'Rese J. Knight and Jean Claude Mwanza and Anthony Viera and Fernando Castanera and Jemaima Che-Hamzah",
abstract = "Purpose
To test the hypothesis that contact lens sensor (CLS)-based 24-hour profiles of ocular volume changes contain information complementary to intraocular pressure (IOP) to discriminate between primary open-angle glaucoma (POAG) and healthy (H) eyes.
Design
Development and evaluation of a diagnostic test with machine learning.
Methods
Subjects: From 435 subjects (193 healthy and 242 POAG), 136 POAG and 136 age-matched healthy subjects were selected. Subjects with contraindications for CLS wear were excluded. Procedure: This is a pooled analysis of data from 24 prospective clinical studies and a registry. All subjects underwent 24-hour CLS recording on 1 eye. Statistical and physiological CLS parameters were derived from the signal recorded. CLS parameters frequently associated with the presence of POAG were identified using a random forest modeling approach. Main Outcome Measures: Area under the receiver operating characteristic curve (ROC AUC) for feature sets including CLS parameters and Start IOP, as well as a feature set with CLS parameters and Start IOP combined.
Results
The CLS parameters feature set discriminated POAG from H eyes with mean ROC AUCs of 0.611, confidence interval (CI) 0.493–0.722. Larger values of a given CLS parameter were in general associated with a diagnosis of POAG. The Start IOP feature set discriminated between POAG and H eyes with a mean ROC AUC of 0.681, CI 0.603–0.765. The combined feature set was the best indicator of POAG with an ROC AUC of 0.759, CI 0.654–0.855. This ROC AUC was statistically higher than for CLS parameters or Start IOP feature sets alone (both P < .0001).
Conclusions
CLS recordings contain information complementary to IOP that enable discrimination between H and POAG. The feature set combining CLS parameters and Start IOP provide a better indication of the presence of POAG than each of the feature sets separately. As such, the CLS may be a new biomarker for POAG."
}
• B. Wicht, A. Fischer, and J. Hennebert, "Seamless GPU Evaluation of Smart Expression Templates," in 2018 International Conference on High Performance Computing Simulation (HPCS), 2018, pp. 196-203.
[Bibtex]
@inproceedings{wicht2018gpu,
author={B. {Wicht} and A. {Fischer} and J. {Hennebert}},
booktitle={2018 International Conference on High Performance Computing Simulation (HPCS)},
title={Seamless GPU Evaluation of Smart Expression Templates},
year={2018},
volume={},
number={},
pages={196-203},
abstract={Expression Templates is a technique allowing to write linear algebra code in C++ the same way it would be written on paper. It is also used extensively as a performance optimization technique, especially as the Smart Expression Templates form which allows for even higher performance. It has proved to be very efficient for computation on a Central Processing Unit (CPU). However, due to its design, it is not easily implemented on a Graphics Processing Unit (GPU). In this paper, we devise a set of techniques to allow the seamless evaluation of Smart Expression Templates on the GPU. The execution is transparent for the user of the library which still uses the matrices and vector as if it was on the CPU and profits from the performance and higher multi-processing capabilities of the GPU. We also show that the GPU version is significantly faster than the CPU version, without any change to the code of the user.},
keywords={C++ language;graphics processing units;matrix algebra;optimisation;parallel processing;software performance evaluation;CPU;seamless evaluation;GPU version;linear algebra code;performance optimization technique;central processing unit;graphics processing unit;GPU evaluation;multiprocessing capabilities;smart expression templates form;Graphics processing units;Kernel;Libraries;C++ languages;Runtime;Central Processing Unit;High performance computing},
doi={10.1109/HPCS.2018.00045},
ISSN={},
month={July}
}
• B. Wicht, A. Fischer, and J. Hennebert, "DLL: A Fast Deep Neural Network Library," in Artificial Neural Networks in Pattern Recognition, L. Pancioni, F. Schwenker, and E. Trentin, Eds., Springer International Publishing, 2018, p. 54–65.
[Bibtex]
@inbook{wicht18dll,
Author = {B. Wicht and A. Fischer and J. Hennebert},
Booktitle = {Artificial Neural Networks in Pattern Recognition},
Date-Modified = {2018-10-22 09:07:00 +0000},
Editor = {Pancioni, Luca and Schwenker, Friedhelm and Trentin, Edmondo},
Isbn = "978-3-319-99978-4",
Doi = "10.1007/978-3-319-99978-4",
Pages = {54--65},
Publisher = {Springer International Publishing},
Series = {Lecture Notes in Artificial Intelligence},
Title = {{DLL}: A Fast Deep Neural Network Library},
Year = {2018}}
• O. Zayene, S. Masmoudi Touj, J. Hennebert, R. Ingold, and N. Essoukri Ben Amara, "Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video," IET Computer Vision, 2018.
[Bibtex]
@ARTICLE{ietzayene2018,
author = {Zayene, Oussama and Masmoudi Touj, Sameh and Hennebert, Jean and Ingold, Rolf and Essoukri Ben Amara, Najoua},
keywords = {text pattern variability;public AcTiV-R dataset;artificial Arabic video text recognition;evaluation protocols;Arabic character models;multimedia document annotation;segmentation-free method;line levels;nonuniform intraword distances;news video;public dataset ALIF;recurrent neural networks;multidimensional long short-term memory networks;interword distances;diacritic marks;multimedia document indexing;embedded texts;connectionist temporal classification layer;},
ISSN = {1751-9632},
language = {English},
abstract = {This study presents a novel approach for Arabic video text recognition based on recurrent neural networks. In fact, embedded texts in videos represent a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non-trivial task due to many challenges like the variability of text patterns and the complexity of backgrounds. In the case of Arabic, the presence of diacritic marks, the cursive nature of the script and the non-uniform intra/inter word distances, may introduce many additional challenges. The proposed system presents a segmentation-free method that relies specifically on a multi-dimensional long short-term memory coupled with a connectionist temporal classification layer. It is shown that using an efficient pre-processing step and a compact representation of Arabic character models brings robust performance and yields a low-error rate than other recently published methods. The authors’ system is trained and evaluated using the public AcTiV-R dataset under different evaluation protocols. The obtained results are very interesting. They also outperform current state-of-the-art approaches on the public dataset ALIF in terms of recognition rates at both character and line levels.},
title = {Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video},
journal = {IET Computer Vision},
year = {2018},
month = {March},
publisher ={Institution of Engineering and Technology},
url = {http://digital-library.theiet.org/content/journals/10.1049/iet-cvi.2017.0468},
DOI = {10.1049/iet-cvi.2017.0468}
}
• O. Zayene, S. Masmoudi Touj, J. Hennebert, R. Ingold, and N. Essoukri Ben Amara, "Open Datasets and Tools for Arabic Text Detection and Recognition in News Video Frames," Journal of Imaging, vol. 4, iss. 2, 2018.
[Bibtex]
@Article{jimagingzayene2018,
AUTHOR = {Zayene, Oussama and Masmoudi Touj, Sameh and Hennebert, Jean and Ingold, Rolf and Essoukri Ben Amara, Najoua},
TITLE = {Open Datasets and Tools for Arabic Text Detection and Recognition in News Video Frames},
JOURNAL = {Journal of Imaging},
VOLUME = {4},
YEAR = {2018},
NUMBER = {2},
URL = {http://www.mdpi.com/2313-433X/4/2/32},
ISSN = {2313-433X},
ABSTRACT = {Recognizing texts in video is more complex than in other environments such as scanned documents. Video texts appear in various colors, unknown fonts and sizes, often affected by compression artifacts and low quality. In contrast to Latin texts, there are no publicly available datasets which cover all aspects of the Arabic Video OCR domain. This paper describes a new well-defined and annotated Arabic-Text-in-Video dataset called AcTiV 2.0. The dataset is dedicated especially to building and evaluating Arabic video text detection and recognition systems. AcTiV 2.0 contains 189 video clips serving as a raw material for creating 4063 key frames for the detection task and 10,415 cropped text images for the recognition task. AcTiV 2.0 is also distributed with its annotation and evaluation tools that are made open-source for standardization and validation purposes. This paper also reports on the evaluation of several systems tested under the proposed detection and recognition protocols.},
DOI = {10.3390/jimaging4020032}
}
• K. Chen, M. Seuret, J. Hennebert, and R. Ingold, "Convolutional Neural Networks for Page Segmentation of Historical Document Images," in Proc. 14th Int. Conf. on Document Analysis and Recognition ICDAR, 2017, p. 965–970.
[Bibtex]
@inproceedings{chen2017icdar,
Author = {Kai Chen and Mathias Seuret and Jean Hennebert and Rolf Ingold},
Booktitle = {Proc. 14th Int. Conf. on Document Analysis and Recognition ICDAR},
Title = {Convolutional Neural Networks for Page Segmentation of Historical Document Images},
Pages = {965--970},
Year = {2017}}
• L. Linder, D. Vionnet, J. Bacher, and J. Hennebert, "Big Building Data - a Big Data Platform for Smart Buildings," Energy Procedia, vol. 122, pp. 589-594, 2017.
[Bibtex]
@article{2017lindercisbat,
title = "Big Building Data - a Big Data Platform for Smart Buildings",
journal = "Energy Procedia",
volume = "122",
pages = "589 - 594",
year = "2017",
note = "CISBAT 2017 International Conference – Future Buildings & Districts – Energy Efficiency from Nano to Urban Scale",
issn = "1876-6102",
doi = "10.1016/j.egypro.2017.07.354",
url = "http://www.sciencedirect.com/science/article/pii/S1876610217329582",
author = "Lucy Linder and Damien Vionnet and Jean-Philippe Bacher and Jean Hennebert",
keywords = "Big Data, Building Management Systems, Smart Buildings, Web of Buildings",
abstract = "Future buildings will more and more rely on advanced Building Management Systems (BMS) connected to a variety of sensors, actuators and dedicated networks. Their objectives are to observe the state of rooms and apply automated rules to preserve or increase comfort while economizing energy. In this work, we advocate for the inclusion of a dedicated system for sensors data storage and processing, based on Big Data technologies. This choice enables new potentials in terms of data analytics and applications development, the most obvious one being the ability to scale up seamlessly from one smart building to several, in the direction of smart areas and smart cities. We report in this paper on our system architecture and on several challenges we met in its elaboration, attempting to meet requirements of scalability, data processing, flexibility, interoperability and privacy. We also describe current and future end-user services that our platform will support, including historical data retrieval, visualisation, processing and alarms. The platform, called BBData - Big Building Data, is currently in production at the Smart Living Lab of Fribourg and is offered to several research teams to ease their work, to foster the sharing of historical data and to avoid that each project develops its own data gathering and processing pipeline."
}
• F. Rossier, P. Lang, and J. Hennebert, "Near Real-Time Appliance Recognition Using Low Frequency Monitoring and Active Learning Methods," Energy Procedia, vol. 122, pp. 691-696, 2017.
[Bibtex]
@article{2017rossiercisbat,
title = "Near Real-Time Appliance Recognition Using Low Frequency Monitoring and Active Learning Methods",
journal = "Energy Procedia",
volume = "122",
pages = "691 - 696",
year = "2017",
note = "CISBAT 2017 International Conference Future Buildings & Districts -- Energy Efficiency from Nano to Urban Scale",
issn = "1876-6102",
doi = "10.1016/j.egypro.2017.07.371",
url = "http://www.sciencedirect.com/science/article/pii/S1876610217329752",
author = "Florian Rossier and Philippe Lang and Jean Hennebert",
keywords = "NILM, Appliance recognition, active learning",
abstract = "Electricity load monitoring in residential buildings has become an important task allowing for energy consumption understanding, indirect human activity recognition and occupancy modelling. In this context, Non Intrusive Load Monitoring (NILM) is an approach based on the analysis of the global electricity consumption signal of the habitation. Current NILM solutions are reaching good precision for the identification of electrical devices but at the cost of difficult setups with expensive equipments typically working at high frequency. In this work we propose to use a low-cost and easy to install low frequency sensor for which we improve the performances with an active machine learning strategy. At setup, the system is able to identify some appliances with typical signatures such as a fridge. During usage, the system detects unknown signatures and provides a user-friendly procedure to include new appliances and to improve the identification precision over time."
}
• F. Slimane, R. Ingold, and J. Hennebert, "ICDAR2017 Competition on Multi-font and Multi-size Digitally Represented Arabic Text," in Proc. 14th Int. Conf. on Document Analysis and Recognition ICDAR, 2017, pp. 1466–1472.
[Bibtex]
@inproceedings{slimane2017icdar,
Author = {Fouad Slimane and Rolf Ingold and Jean Hennebert},
Booktitle = {Proc. 14th Int. Conf. on Document Analysis and Recognition ICDAR},
Title = {ICDAR2017 Competition on Multi-font and Multi-size Digitally Represented Arabic Text},
Pages = {1466--1472},
Year = {2017},
abstract = {This paper describes the organisation and results of the Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text held in the context of the 14th International Conference on Document Analysis and Recognition (ICDAR'2017), during November 10-15, 2017, Kyoto, Japan. This competition has used the freely available Arabic Printed Text Image (APTI) database. A first and second editions took place respectively in ICDAR'2011 and ICDAR'2013. In this edition, we propose four challenges. Six research groups are participating in the competition with thirteen systems. These systems are compared using the font, font-size, font and fontsize, and character and word recognition rates. The systems were tested in a blind manner using the first 5000 images of APTI database set 6. A short description of the participating groups, their systems, the experimental setup, and the observed results are presented.}}
• O. Zayene, J. Hennebert, R. Ingold, and N. E. BenAmara, "ICDAR2017 Competition on Arabic Text Detection and Recognition in Multi-resolution Video Frames," in Proc. 14th Int. Conf. on Document Analysis and Recognition ICDAR, 2017, pp. 1460–1465.
[Bibtex]
@inproceedings{zayene2017icdar,
Author = {Oussama Zayene and Jean Hennebert and Rolf Ingold and Najoua Essoukri BenAmara},
Booktitle = {Proc. 14th Int. Conf. on Document Analysis and Recognition ICDAR},
Title = {ICDAR2017 Competition on Arabic Text Detection and Recognition in Multi-resolution Video Frames},
Pages = {1460--1465},
Year = {2017}}
• K. Chen, M. Seuret, M. Liwicki, J. Hennebert, C. Liu, and R. Ingold, "Page Segmentation for Historical Handwritten Document Images Using Conditional Random Fields," in 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016, pp. 90-95.
[Bibtex]
@INPROCEEDINGS{chen2016:icfhr,
author={Kai Chen and Mathias Seuret and Marcus Liwicki and Jean Hennebert and Cheng-Lin Liu and Rolf Ingold},
booktitle={2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR)},
title={Page Segmentation for Historical Handwritten Document Images Using Conditional Random Fields},
year={2016},
pages={90-95},
abstract={In this paper, we present a Conditional Random Field (CRF) model to deal with the problem of segmenting handwritten historical document images into different regions. We consider page segmentation as a pixel-labeling problem, i.e., each pixel is assigned to one of a set of labels. Features are learned from pixel intensity values with stacked convolutional autoencoders in an unsupervised manner. The features are used for the purpose of initial classification with a multilayer perceptron. Then a CRF model is introduced for modeling the local and contextual information jointly in order to improve the segmentation. For the purpose of decreasing the time complexity, we perform labeling at superpixel level. In the CRF model, graph nodes are represented by superpixels. The label of each pixel is determined by the label of the superpixel to which it belongs. Experiments on three public datasets demonstrate that, compared to previous methods, the proposed method achieves more accurate segmentation results and is much faster.},
keywords={document image processing,graph theory,handwriting recognition,image classification,image segmentation,multilayer perceptrons, unsupervised learning, CRF, conditional random field, historical handwritten document, stacked convolutional autoencoders, superpixel, Autoencoder},
doi={10.1109/ICFHR.2016.0029},
ISSN={2167-6445},
month={Oct},
pdf={http://www.hennebert.org/download/publications/icfhr-2016-page-segmentation-for-historical-handwritten-document-images-using-conditional-random-fields.pdf}}
• A. Ridi, C. Gisler, and J. Hennebert, "Aggregation procedure of Gaussian Mixture Models for additive features," in 23rd International Conference on Pattern Recognition (ICPR), 2016, pp. 2545-2550.
[Bibtex]
@conference{ridi-icpr2016,
author = "Antonio Ridi and Christophe Gisler and Jean Hennebert",
abstract = "In this work we provide details on a new and effective approach able to generate Gaussian Mixture Models (GMMs) for the classification of aggregated time series. More specifically, our procedure can be applied to time series that are aggregated together by adding their features. The procedure takes advantage of the additive property of the Gaussians that complies with the additive property of the features. Our goal is to classify aggregated time series, i.e. we aim to identify the classes of the single time series contributing to the total. The standard approach consists in training the models using the combination of several time series coming from different classes. However, this has the drawback of being a very slow operation given the amount of data. The proposed approach, called GMMs aggregation procedure, addresses this problem. It consists of three steps: (i) modeling the independent classes, (ii) generation of the models for the class combinations and (iii) simplification of the generated models. We show the effectiveness of our approach by using time series in the context of electrical appliance consumption, where the time series are aggregated by adding the active and reactive power. Finally, we compare the proposed approach with the standard procedure.",
booktitle = "23rd International Conference on Pattern Recognition (ICPR)",
editor = "IEEE",
keywords = "machine learning, electric signal, appliance signatures, GMMs",
month = "December",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "2545-2550",
title = "{A}ggregation procedure of {G}aussian {M}ixture {M}odels for additive features",
year = "2016",
}
• B. Wicht, A. Fischer, and J. Hennebert, "Deep Learning Features for Handwritten Keyword Spotting," in 23rd International Conference on Pattern Recognition (ICPR), 2016, pp. 3423-3428.
[Bibtex]
@conference{wicht:icpr2016,
author = "Baptiste Wicht and Andreas Fischer and Jean Hennebert",
abstract = "Deep learning had a significant impact on diverse pattern recognition tasks in the recent past. In this paper, we investigate its potential for keyword spotting in handwritten documents by designing a novel feature extraction system based on Convolutional Deep Belief Networks. Sliding window features are learned from word images in an unsupervised manner. The proposed features are evaluated both for template-based word spotting with Dynamic Time Warping and for learning-based word spotting with Hidden Markov Models. In an experimental evaluation on three benchmark data sets with historical and modern handwriting, it is shown that the proposed learned features outperform three standard sets of handcrafted features.",
booktitle = "23rd International Conference on Pattern Recognition (ICPR)",
editor = "IEEE",
keywords = "Handwriting Recognition, Deep learning, Artificial neural networks, keyword spotting",
month = "December",
pages = "3423-3428",
title = "{D}eep {L}earning {F}eatures for {H}andwritten {K}eyword {S}potting",
year = "2016",
}
• B. Wicht, A. Fischer, and J. Hennebert, "On CPU Performance Optimization of Restricted Boltzmann Machine and Convolutional RBM," in Artificial Neural Networks in Pattern Recognition: 7th IAPR TC3 Workshop, ANNPR 2016, Ulm, Germany, September 28–30, 2016, Proceedings, F. Schwenker, H. M. Abbas, N. El Gayar, and E. Trentin, Eds., Cham: Springer International Publishing, 2016, pp. 163–174.
[Bibtex]
@inbook{wicht:2016annpr,
author = "Baptiste Wicht and Andreas Fischer and Jean Hennebert",
booktitle = "Artificial Neural Networks in Pattern Recognition: 7th IAPR TC3 Workshop, ANNPR 2016, Ulm, Germany, September 28--30, 2016, Proceedings",
doi = "10.1007/978-3-319-46182-3_14",
editor = "Schwenker, Friedhelm
and Abbas, M. Hazem
and El Gayar, Neamat
and Trentin, Edmondo",
isbn = "978-3-319-46182-3",
pages = "163--174",
publisher = "Springer International Publishing",
title = "{O}n {CPU} {P}erformance {O}ptimization of {R}estricted {B}oltzmann {M}achine and {C}onvolutional {RBM}",
url = "http://dx.doi.org/10.1007/978-3-319-46182-3_14",
year = "2016",
}
• B. Wicht, A. Fischer, and J. Hennebert, "Keyword Spotting with Convolutional Deep Belief Networks and Dynamic Time Warping," in Artificial Neural Networks and Machine Learning – ICANN 2016: 25th International Conference on Artificial Neural Networks, Barcelona, Spain, September 6-9, 2016, Proceedings, Part II, A. E. P. Villa, P. Masulli, and A. J. Pons Rivero, Eds., Cham: Springer International Publishing, 2016, pp. 113–120.
[Bibtex]
@Inbook{wicht:2016icann,
author="Wicht, Baptiste
and Fischer, Andreas
and Hennebert, Jean",
editor="Villa, Alessandro E.P.
and Masulli, Paolo
and Pons Rivero, Antonio Javier",
title="Keyword Spotting with Convolutional Deep Belief Networks and Dynamic Time Warping",
bookTitle="Artificial Neural Networks and Machine Learning -- ICANN 2016: 25th International Conference on Artificial Neural Networks, Barcelona, Spain, September 6-9, 2016, Proceedings, Part II",
year="2016",
publisher="Springer International Publishing",
pages="113--120",
isbn="978-3-319-44781-0",
doi="10.1007/978-3-319-44781-0_14",
url="http://dx.doi.org/10.1007/978-3-319-44781-0_14"
}
• O. Zayene, N. Hajjej, S. Masmoudi Touj, S. Ben Mansour, J. Hennebert, R. Ingold, and N. Amara, "ICPR2016 contest on Arabic Text detection and Recognition in video frames - AcTiVComp." 2016, pp. 187-191.
[Bibtex]
@inproceedings{oussama2016icpr,
author = {Zayene, Oussama and Hajjej, Nadia and Masmoudi Touj, Sameh and Ben Mansour, Soumaya and Hennebert, Jean and Ingold, Rolf and Amara, Najoua},
year = {2016},
month = {12},
pages = {187-191},
title = {ICPR2016 contest on Arabic Text detection and Recognition in video frames - AcTiVComp},
doi = {10.1109/ICPR.2016.7899631}
}
• O. Zayene, M. Seuret, S. M. Touj, J. Hennebert, R. Ingold, and N. E. B. Amara, "Text Detection in Arabic News Video Based on SWT Operator and Convolutional Auto-Encoders," in 2016 12th IAPR Workshop on Document Analysis Systems (DAS), 2016, pp. 13-18.
[Bibtex]
@inproceedings{oussama2016das,
author={O. {Zayene} and M. {Seuret} and S. M. {Touj} and J. {Hennebert} and R. {Ingold} and N. E. B. {Amara}},
booktitle={2016 12th IAPR Workshop on Document Analysis Systems (DAS)},
title={Text Detection in Arabic News Video Based on SWT Operator and Convolutional Auto-Encoders},
year={2016},
volume={},
number={},
pages={13-18},
keywords={image coding;image filtering;natural language processing;text detection;transforms;unsupervised learning;video signal processing;visual databases;text specificities;antialiasing artifacts;horizontally aligned artificial text detection;Arabic news video;stroke width transform algorithm;SWT algorithm;convolutional autoencoder;text candidate components;geometric constraints;stroke width information;CAE;unsupervised feature learning method;textline candidates;Arabic-text-in-video database;AcTiV-DB;evaluation protocols;TV channels;compression artifacts;Feature extraction;Computer aided engineering;Image edge detection;Learning systems;Training;Filtering algorithms;Support vector machines;Arabic text detection;SWT operator;CAE;AcTiV-DB},
doi={10.1109/DAS.2016.80},
ISSN={},
month={April}
}
• O. Zayene, S. M. Touj, J. Hennebert, R. Ingold, and N. E. Ben Amara, "Data, protocol and algorithms for performance evaluation of text detection in Arabic news video," in 2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), 2016, pp. 258-263.
[Bibtex]
@inproceedings{oussama2016atsip,
author={O. {Zayene} and S. M. {Touj} and J. {Hennebert} and R. {Ingold} and N. E. {Ben Amara}},
booktitle={2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP)},
title={Data, protocol and algorithms for performance evaluation of text detection in Arabic news video},
year={2016},
volume={},
number={},
pages={258-263},
abstract={Benchmark datasets and their corresponding evaluation protocols are commonly used by the computer vision community, in a variety of application domains, to assess the performance of existing systems. Even though text detection and recognition in video has seen much progress in recent years, relatively little work has been done to propose standardized annotations and evaluation protocols especially for Arabic Video-OCR systems. In this paper, we present a framework for evaluating text detection in videos. Additionally, dataset, ground-truth annotations and evaluation protocols, are provided for Arabic text detection. Moreover, two published text detection algorithms are tested on a part of the AcTiV database and evaluated using a set of the proposed evaluation protocols.},
keywords={computer vision;natural language processing;optical character recognition;performance evaluation;text detection;video signal processing;performance evaluation;text detection;Arabic news video;computer vision;Arabic video-OCR system;Protocols;Databases;Image edge detection;Optical character recognition software;Detection algorithms;Detectors;XML;text detection;Evaluation Protocol;AcTiV database;Arabic Video-OCR},
doi={10.1109/ATSIP.2016.7523079},
ISSN={},
month={March}
}
• K. Chen, M. Seuret, H. Wei, M. Liwicki, J. Hennebert, and R. Ingold, "Ground truth model, tool, and dataset for layout analysis of historical documents," in SPIE Electronic Imaging 2015, 2015.
[Bibtex]
@conference{chen2015spie,
Author = {Kai Chen and Mathias Seuret and Hao Wei and Marcus Liwicki and Jean Hennebert and Rolf Ingold},
Booktitle = {SPIE Electronic Imaging 2015},
Keywords = {machine learning, image analysis, historical documents},
Month = {February},
Publisher = {SPIE Electronic Imaging},
Title = {{G}round truth model, tool, and dataset for layout analysis of historical documents},
Year = {2015}}
• K. Chen, M. Seuret, M. Liwicki, J. Hennebert, and R. Ingold, "Page segmentation of historical document images with convolutional autoencoders," in 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015, pp. 1011-1015.
[Bibtex]
@INPROCEEDINGS{chen2015:icdar,
author={K. Chen and M. Seuret and M. Liwicki and J. Hennebert and R. Ingold},
booktitle={2015 13th International Conference on Document Analysis and Recognition (ICDAR)},
title={Page segmentation of historical document images with convolutional autoencoders},
year={2015},
pages={1011-1015},
abstract={In this paper, we present an unsupervised feature learning method for page segmentation of historical handwritten documents available as color images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as either periphery, background, text block, or decoration. Traditional methods in this area rely on carefully hand-crafted features or large amounts of prior knowledge. In contrast, we apply convolutional autoencoders to learn features directly from pixel intensity values. Then, using these features to train an SVM, we achieve high quality segmentation without any assumption of specific topologies and shapes. Experiments on three public datasets demonstrate the effectiveness and superiority of the proposed approach.},
keywords={document image processing;handwritten character recognition;history;image colour analysis;image segmentation;support vector machines;unsupervised learning;SVM;color images;convolutional autoencoders;historical document images;historical handwritten documents;page segmentation;pixel intensity values;pixel labeling problem;support vector machine;unsupervised feature learning method;Image segmentation;Robustness;Support vector machines},
doi={10.1109/ICDAR.2015.7333914},
month={Aug},
pdf={http://www.hennebert.org/download/publications/icdar-2015-page-segmentation-of-historical-document-images-with-convolutional-autoencoders.pdf},}
• C. Gisler, A. Ridi, J. Hennebert, R. N. Weinreb, and K. Mansouri, "Automated Detection and Quantification of Circadian Eye Blinks Using a Contact Lens Sensor," Translational Vision Science and Technology (TVST), vol. 4, iss. 1, pp. 1-10, 2015.
[Bibtex]
@article{gisler2015automated,
Abstract = {Purpose: To detect and quantify eye blinks during 24-hour intraocular pressure (IOP) monitoring with a contact lens sensor (CLS). Methods: A total of 249 recordings of 24-hour IOP patterns from 202 participants using a CLS were included. Software was developed to automatically detect eye blinks, and wake and sleep periods. The blink detection method was based on detection of CLS signal peaks greater than a threshold proportional to the signal amplitude. Three methods for automated detection of the sleep and wake periods were evaluated. These relied on blink detection and subsequent comparison of the local signal amplitude with a threshold proportional to the mean signal amplitude. These methods were compared to manual sleep/wake verification. In a pilot, simultaneous video recording of 10 subjects was performed to compare the software to observer-measured blink rates. Results: Mean (SD) age of participants was 57.4 $\pm$ 16.5 years (males, 49.5%). There was excellent agreement between software-detected number of blinks and visually measured blinks for both observers (intraclass correlation coefficient [ICC], 0.97 for observer 1; ICC, 0.98 for observer 2). The CLS measured a mean blink frequency of 29.8 $\pm$ 15.4 blinks/min, a blink duration of 0.26 $\pm$ 0.21 seconds and an interblink interval of 1.91 $\pm$ 2.03 seconds. The best method for identifying sleep periods had an accuracy of 95.2 $\pm$ 0.5%. Conclusions: Automated analysis of CLS 24-hour IOP recordings can accurately quantify eye blinks, and identify sleep and wake periods. Translational Relevance: This study sheds new light on the potential importance of eye blinks in glaucoma and may contribute to improved understanding of circadian IOP characteristics.},
Author = {Christophe Gisler and Antonio Ridi and Jean Hennebert and Robert N Weinreb and Kaweh Mansouri},
Doi = {10.1167/tvst.4.1.4},
Journal = {Translational Vision Science and Technology (TVST)},
Keywords = {machine learning, bio-medical signals, glaucoma prediction},
Month = {January},
Number = {1},
Pages = {1-10},
Publisher = {The Association for Research in Vision and Ophthalmology},
Title = {{A}utomated {D}etection and {Q}uantification of {C}ircadian {E}ye {B}links {U}sing a {C}ontact {L}ens {S}ensor},
Volume = {4},
Year = {2015},
Bdsk-Url-2 = {http://dx.doi.org/10.1167/tvst.4.1.4}}
• A. Ridi, N. Zarkadis, C. Gisler, and J. Hennebert, "Duration Models for Activity Recognition and Prediction in Buildings using Hidden Markov Models," in Proceedings of the 2015 International Conference on Data Science and Advanced Analytics (DSAA 2015), Paris, France, 2015, p. 10.
[Bibtex]
@inproceedings{RidiDSAA2015,
abstract = {Activity recognition and prediction in buildings can have multiple positive effects in buildings: improve elderly monitoring, detect intrusions, maximize energy savings and optimize occupant comfort. In this paper we apply human activity recognition by using data coming from a network of motion and door sensors distributed in a Smart Home environment. We use Hidden Markov Models (HMM) as the basis of a machine learning algorithm on data collected over an 8-month period from a single-occupant home available as part of the WSU CASAS Smart Home project. In the first implementation the HMM models 24 hours of activities and classifies them in 8 distinct activity categories with an accuracy rate of 84.6{\%}. To improve the identification rate and to help detect potential abnormalities related with the duration of an activity (i.e. when certain activities last too much), we implement minimum duration modeling where the algorithm is forced to remain in a certain state for a specific amount of time. Two subsequent implementations of the minimum duration HMM (mean-based length modeling and quantile length modeling) yield a further 2{\%} improvement of the identification rate. To predict the sequence of activities in the future, Artificial Neural Networks (ANN) are employed and identified activities clustered in 3 principal activity groups with an average accuracy rate of 71-77.5{\%}, depending on the forecasting window. To explore the energy savings potential, we apply thermal dynamic simulations on buildings in central European climate for a period of 65 days during the winter and we obtain energy savings for space heating of up to 17{\%} with 3-hour forecasting for two different types of buildings.},
author = {Ridi, Antonio and Zarkadis, Nikos and Gisler, Christophe and Hennebert, Jean},
booktitle = {Proceedings of the 2015 International Conference on Data Science and Advanced Analytics (DSAA 2015)},
editor = {Gaussier, Eric and Cao, Longbing},
isbn = {9781467382731},
keywords = {Activity recognition,Energy savings in buildings,Expanded Hidden,Markov Models,Minimum Duration modeling,activity recognition,energy savings in buildings,expanded hidden,markov models,minimum duration modeling},
mendeley-tags = {Activity recognition,Energy savings in buildings,Expanded Hidden,Markov Models,Minimum Duration modeling},
pages = {10},
publisher = {IEEE Computer Society},
title = {{Duration Models for Activity Recognition and Prediction in Buildings using Hidden Markov Models}},
url = {http://dsaa2015.lip6.fr},
year = {2015}
}
• A. Ridi, C. Gisler, and J. Hennebert, "Processing smart plug signals using machine learning," in Wireless Communications and Networking Conference Workshops (WCNCW), 2015 IEEE, 2015, pp. 75-80.
[Bibtex]
@conference{ridi2015wcnc,
Abstract = {The automatic identification of appliances through the analysis of their electricity consumption has several purposes in Smart Buildings including better understanding of the energy consumption, appliance maintenance and indirect observation of human activities. Electric signatures are typically acquired with IoT smart plugs integrated or added to wall sockets. We observe an increasing number of research teams working on this topic under the umbrella Intrusive Load Monitoring. This term is used as opposition to Non-Intrusive Load Monitoring that refers to the use of global smart meters. We first present the latest evolutions of the ACS-F database, a collections of signatures that we made available for the scientific community. The database contains different brands and/or models of appliances with up to 450 signatures. Two evaluation protocols are provided with the database to benchmark systems able to recognise appliances from their electric signature. We present in this paper two additional evaluation protocols intended to measure the impact of the analysis window length. Finally, we present our current best results using machine learning approaches on the 4 evaluation protocols.},
Author = {A. Ridi and C. Gisler and J. Hennebert},
Booktitle = {Wireless Communications and Networking Conference Workshops (WCNCW), 2015 IEEE},
Doi = {10.1109/WCNCW.2015.7122532},
Keywords = {learning,artificial intelligence, power engineering computing,power supplies to apparatus,ACS-F database,IoT smart plugs,machine learning approaches,smart buildings,smart plug signals,umbrella intrusive load monitoring,Accuracy,Databases,Hidden Markov mod},
Month = {March},
Pages = {75-80},
Title = {{P}rocessing smart plug signals using machine learning},
Year = {2015},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/WCNCW.2015.7122532}}
• A. Ridi, C. Gisler, and J. Hennebert, "User Interaction Event Detection in the Context of Appliance Monitoring," in The 13th International Conference on Pervasive Computing and Communications (PerCom 2015), Workshop on Pervasive Energy Services (PerEnergy), 2015, pp. 323-328.
[Bibtex]
@conference{ridi2015percom,
Abstract = {In this paper we assess about the recognition of User Interaction events when handling electrical devices. This work is placed in the context of Intrusive Load Monitoring used for appliance recognition. ILM implies several Smart Metering Sensors to be placed inside the environment under analysis (in our case we have one Smart Metering Sensor per device). Our existing system is able to recognise the appliance class (as coffee machine, printer, etc.) and the sequence of states (typically Active / Non-Active) by using Hidden Markov Models as machine learning algorithm. In this paper we add a new layer to our system architecture called User Interaction Layer, aimed to infer the moments (called User Interaction events) during which the user interacts with the appliance. This layer uses as input the information coming from HMM (i.e. the recognised appliance class and the sequence of states). The User Interaction events are derived from the analysis of the transitions in the sequences of states and a ruled-based system adds or removes these events depending on the recognised class. Finally we compare the list of events with the ground truth and we obtain three different accuracy rates: (i) 96.3% when the correct model and the real sequence of states are known a priori, (ii) 82.5% when only the correct model is known and (iii) 80.5% with no a priori information.},
Author = {Antonio Ridi and Christophe Gisler and Jean Hennebert},
Booktitle = {The 13th International Conference on Pervasive Computing and Communications (PerCom 2015), Workshop on Pervasive Energy Services (PerEnergy)},
Doi = {10.1109/PERCOMW.2015.7134056},
Keywords = {domestic appliances;hidden Markov models;home automation;human computer interaction;learning (artificial intelligence);smart meters;HMM;ILM;appliance monitoring;appliance recognition;electrical devices;hidden Markov models;intrusive load monitoring;machine learning algorithm;ruled-based system;smart metering sensors;user interaction event detection;user interaction layer;Accuracy;Databases;Hidden Markov models;Home appliances;Mobile handsets;Monitoring;Senior citizens;Appliance Identification;Intrusive Load Monitoring (ILM);User-Appliance Interaction},
Pages = {323-328},
Title = {{U}ser {I}nteraction {E}vent {D}etection in the {C}ontext of {A}ppliance {M}onitoring},
Year = {2015},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/PERCOMW.2015.7134056}}
• O. Zayene, J. Hennebert, S. Masmoudi Touj, R. Ingold, and N. Essoukri Ben Amara, "A dataset for Arabic text detection, tracking and recognition in news videos- AcTiV," in 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015, pp. 996-1000.
[Bibtex]
@INPROCEEDINGS{zayene2015:icdar,
author={O. Zayene and J. Hennebert and S. Masmoudi Touj and R. Ingold and N. Essoukri Ben Amara},
booktitle={2015 13th International Conference on Document Analysis and Recognition (ICDAR)},
title={A dataset for Arabic text detection, tracking and recognition in news videos- AcTiV},
year={2015},
pages={996-1000},
abstract={Recently, promising results have been reported on video text detection and recognition. Most of the proposed methods are tested on private datasets with non-uniform evaluation metrics. We report here on the development of a publicly accessible annotated video dataset designed to assess the performance of different artificial Arabic text detection, tracking and recognition systems. The dataset includes 80 videos (more than 850,000 frames) collected from 4 different Arabic news channels. An attempt was made to ensure maximum diversities of the textual content in terms of size, position and background. This data is accompanied by detailed annotations for each textbox. We also present a region-based text detection approach in addition to a set of evaluation protocols on which the performance of different systems can be measured.},
keywords={natural language processing;optical character recognition;text detection;video signal processing;AcTiV;Arabic news channels;artificial Arabic text detection system;artificial Arabic text recognition systems;artificial Arabic text tracking system;non-uniform evaluation metrics;private datasets;publicly accessible annotated video dataset;region-based text detection approach;textual content;video text detection;video text recognition;Ferroelectric films;High definition video;Manganese;Nonvolatile memory;Protocols;Random access memory;Arabic text;Benchmark;Video OCR;Video database},
doi={10.1109/ICDAR.2015.7333911},
month={Aug},
pdf={http://www.hennebert.org/download/publications/icdar-2015-a-dataset-for-arabic-text-detection-tracking-and-recognition-in-news-videos-activ.pdf},}
• D. Zufferey, T. Hofer, J. Hennebert, M. Schumacher, R. Ingold, and S. Bromuri, "Performance comparison of multi-label learning algorithms on clinical data for chronic diseases," Computers in Biology and Medicine, vol. 65, pp. 34-43, 2015.
[Bibtex]
@article{Zufferey201534,
Abstract = {We are motivated by the issue of classifying diseases of chronically ill patients to assist physicians in their everyday work. Our goal is to provide a performance comparison of state-of-the-art multi-label learning algorithms for the analysis of multivariate sequential clinical data from medical records of patients affected by chronic diseases. As a matter of fact, the multi-label learning approach appears to be a good candidate for modeling overlapped medical conditions, specific to chronically ill patients. With the availability of such a comparison study, the evaluation of new algorithms should be enhanced. According to the method, we choose a summary statistics approach for the processing of the sequential clinical data, so that the extracted features maintain an interpretable link to their corresponding medical records. The publicly available MIMIC-II dataset, which contains more than 19,000 patients with chronic diseases, is used in this study. For the comparison we selected the following multi-label algorithms: ML-kNN, AdaBoostMH, binary relevance, classifier chains, HOMER and RAkEL. Regarding the results, binary relevance approaches, despite their elementary design and their independence assumption concerning the chronic illnesses, perform optimally in most scenarios, in particular for the detection of relevant diseases. In addition, binary relevance approaches scale up to large datasets and are easy to learn. However, the RAkEL algorithm, despite its scalability problems when it is confronted with large datasets, performs well in the scenario which consists of the ranking of the labels according to the dominant disease of the patient.},
Author = {Damien Zufferey and Thomas Hofer and Jean Hennebert and Michael Schumacher and Rolf Ingold and Stefano Bromuri},
Doi = {10.1016/j.compbiomed.2015.07.017},
Issn = {0010-4825},
Journal = {Computers in Biology and Medicine},
Keywords = {Multi-label learning},
Note = {Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.},
Pages = {34-43},
Title = {{P}erformance comparison of multi-label learning algorithms on clinical data for chronic diseases},
Volume = {65},
Year = {2015},
Bdsk-Url-2 = {http://dx.doi.org/10.1016/j.compbiomed.2015.07.017}}
• G. Bovet, G. Briard, and J. Hennebert, "A Scalable Cloud Storage for Sensor Networks," in 4th Int. Conf. on Internet of Things. IoT 2014 MIT, MA, USA. Web of Things Workshop, 2014.
[Bibtex]
@conference{bovet2014:wot2,
author = "G{\'e}r{\^o}me Bovet and Gautier Briard and Jean Hennebert",
abstract = "Data storage has become a major topic in sensor networks as large quantities of data need to be archived for future processing. In this paper, we present a cloud storage solution benefiting from the available memory on smart things becoming data nodes. In-network storage reduces the heavy traffic resulting from the transmission of all the data to an outside central sink. The system built on agents allows an autonomous management of the cloud and therefore requires no human in the loop. It also makes an intensive use of Web technologies to follow the clear trend of sensors adopting the Web-of-Things paradigm. Further, we make a performance evaluation demonstrating its suitability in building management systems.",
booktitle = "4th Int. Conf. on Internet of Things. IoT 2014 MIT, MA, USA. Web of Things Workshop",
doi = "10.1145/2684432.2684437",
isbn = "978-1-4503-3066-4",
keywords = "Distributed databases, cloud storage, web-of-things, sensor networks, internet-of-things",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
publisher = "ACM New York",
series = "International Conference on the Internet of Things - IoT 2014",
title = "{A} {S}calable {C}loud {S}torage for {S}ensor {N}etworks",
year = "2014",
}
• G. Bovet, A. Ridi, and J. Hennebert, "Toward Web Enhanced Building Automation Systems - Big Data and Internet of Things: A Roadmap for Smart Environments," N. Bessis and C. Dobre, Eds., Springer, 2014, vol. 546, pp. 259-284.
[Bibtex]
@inbook{bovet:2014:bookchap,
Abstract = {The emerging concept of Smart Building relies on an intensive use of sensors and actuators and therefore appears, at first glance, to be a domain of predilection for the IoT. However, technology providers of building automation systems have been functioning, for a long time, with dedicated networks, communication protocols and APIs. Eventually, a mix of different technologies can even be present in a given building. IoT principles are now appearing in buildings as a way to simplify and standardise application development. Nevertheless, many issues remain due to this heterogeneity between existing installations and native IP devices that induces complexity and maintenance efforts of building management systems. A key success factor for the IoT adoption in Smart Buildings is to provide a loosely-coupled Web protocol stack allowing interoperation between all devices present in a building. We review in this chapter different strategies that are going in this direction. More specifically, we emphasise on several aspects issued from pervasive and ubiquitous computing like service discovery. Finally, making the assumption of seamless access to sensor data through IoT paradigms, we provide an overview of some of the most exciting enabling applications that rely on intelligent data analysis and machine learning for energy saving in buildings.},
Author = {G{\'e}r{\^o}me Bovet and Antonio Ridi and Jean Hennebert},
Chapter = {11},
Doi = {10.1007/978-3-319-05029-4_11},
Editor = {Nik Bessis and Ciprian Dobre},
Isbn = {9783319050287},
Keywords = {iot, wot, smart building},
Note = {http://www.springer.com/engineering/computational+intelligence+and+complexity/book/978-3-319-05028-7},
Pages = {259-284},
Publisher = {Springer},
Series = {Studies in Computational Intelligence},
Title = {{T}oward {W}eb {E}nhanced {B}uilding {A}utomation {S}ystems - {B}ig {D}ata and {I}nternet of {T}hings: {A} {R}oadmap for {S}mart {E}nvironments},
Volume = {546},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1007/978-3-319-05029-4_11}}
• G. Bovet and J. Hennebert, "Will web technologies impact on building automation systems architecture?," in International Workshop on Enabling ICT for Smart Buildings (ICT-SB 2014), 2014, pp. 985-990.
[Bibtex]
@conference{bovet2014:ant,
Abstract = {Offices, factories and even private housings are more and more endowed with building management systems (BMS) targeting an increase of comfort as well as lowering energy costs. This expansion is made possible by the progress realized in pervasive computing, providing small sized and affordable sensing devices. However, current BMS are often based on proprietary technologies, making their interoperability and evolution more difficult. For example, we observe the emergence of new applications based on intelligent data analysis able to compute more complex models about the use of the building. Such applications rely on heterogeneous sets of sensors, web data, user feedback and self-learning algorithms. In this position paper, we discuss the role of Web technologies for standardizing the application layer, and thus providing a framework for developing advanced building applications. We present our vision of TASSo, a layered Web model facing actual and future challenges for building management systems.},
Author = {G{\'e}r{\^o}me Bovet and Jean Hennebert},
Booktitle = {International Workshop on Enabling ICT for Smart Buildings (ICT-SB 2014)},
Doi = {10.1016/j.procs.2014.05.522},
Issn = {1877-0509},
Keywords = {Building Management System, Internet-of-Things, Web-of-Things, Architecture},
Note = {Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.},
Pages = {985-990},
Series = {Procedia Computer Science},
Title = {{W}ill web technologies impact on building automation systems architecture?},
Volume = {32},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1016/j.procs.2014.05.522}}
• G. Bovet and J. Hennebert, "Distributed Semantic Discovery for Web-of-Things Enabled Smart Buildings," in First International Workshop on Architectures and Technologies for Smart Cities, Dubai, United Arab Emirates, 2014, pp. 1-5.
[Bibtex]
@conference{bovet:ntms:2014,
Abstract = {Nowadays, our surrounding environment is more and more scattered with various types of sensors. Due to their intrinsic properties and representation formats, they form small islands isolated from each other. In order to increase interoperability and release their full capabilities, we propose to represent devices descriptions including data and service invocation with a common model allowing to compose mashups of heterogeneous sensors. Pushing this paradigm further, we also propose to augment service descriptions with a discovery protocol easing automatic assimilation of knowledge. In this work, we describe the architecture supporting what can be called a Semantic Sensor Web-of-Things. As proof of concept, we apply our proposal to the domain of smart buildings, composing a novel ontology covering heterogeneous sensing, actuation and service invocation. Our architecture also emphasizes on the energetic aspect and is optimized for constrained environments.},
Address = {Dubai, United Arab Emirates},
Author = {G{\'e}r{\^o}me Bovet and Jean Hennebert},
Booktitle = {First International Workshop on Architectures and Technologies for Smart Cities},
Doi = {10.1109/NTMS.2014.6814015},
Isbn = {9781479932245},
Keywords = {Smart buildings, Discovery, Semantics, Ontologies},
Month = {Mar},
Pages = {1-5},
Publisher = {IEEE},
Title = {{D}istributed {S}emantic {D}iscovery for {W}eb-of-{T}hings {E}nabled {S}mart {B}uildings},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/NTMS.2014.6814015}}
• G. Bovet, A. Ridi, and J. Hennebert, "Virtual Things for Machine Learning Applications:," in Fifth International Workshop on the Web of Things - WoT 2014, 2014.
[Bibtex]
@conference{bovet2014:wot,
Abstract = {Internet-of-Things (IoT) devices, especially sensors, are producing large quantities of data that can be used for gathering knowledge. In this field, machine learning technologies are increasingly used to build versatile data-driven models. In this paper, we present a novel architecture able to execute machine learning algorithms within the sensor network, presenting advantages in terms of privacy and data transfer efficiency. We first argue that some classes of machine learning algorithms are compatible with this approach, namely those based on the use of generative models that allow a distribution of the computation on a set of nodes. We then detail our architecture proposal, leveraging the use of Web-of-Things technologies to ease integration into networks. The convergence of machine learning generative models and Web-of-Things paradigms leads us to the concept of virtual things exposing higher level knowledge by exploiting sensor data in the network. Finally, we demonstrate with a real scenario the feasibility and performances of our proposal.},
Author = {G{\'e}r{\^o}me Bovet and Antonio Ridi and Jean Hennebert},
Booktitle = {Fifth International Workshop on the Web of Things - WoT 2014},
Doi = {10.1145/2684432.2684434},
Isbn = {978-1-4503-3066-4},
Journal = {Fifth International Workshop on the Web of Things (WoT 2014)},
Keywords = {Machine learning, Sensor network, Web-of-Things, Internet-of-Things},
Month = {oct},
Note = {Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.},
Series = {International Conference on the Internet of Things - IoT 2014},
Title = {{V}irtual {T}hings for {M}achine {L}earning {A}pplications:},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1145/2684432.2684434}}
• G. Bovet, A. Ridi, and J. Hennebert, "Appliance Recognition on Internet-of-Things Devices," in 4th Int. Conf. on Internet of Things. IoT 2014 MIT, MA, USA., 2014.
[Bibtex]
@conference{bovet2014:iotdemo,
author = "G{\'e}r{\^o}me Bovet and Antonio Ridi and Jean Hennebert",
abstract = "Machine Learning (ML) approaches are increasingly used to model data coming from sensor networks. Typical ML implementations are CPU intensive and are often running server-side. However, IoT devices provide increasing CPU capabilities and some classes of ML algorithms are compatible with distribution and downward scalability. In this demonstration we explore the possibility of distributing ML tasks to IoT devices in the sensor network. We demonstrate a concrete scenario of appliance recognition where a smart plug provides electrical measures that are distributed to WiFi nodes running the ML algorithms. Each node estimates class-conditional probabilities that are then merged for recognizing the appliance category. Finally, our architecture relies on Web technologies for complying with Web-of-Things paradigms.",
booktitle = "4th Int. Conf. on Internet of Things. IoT 2014 MIT, MA, USA.",
keywords = "Internet-of-Things, Machine Learning, Appliance Recognition, NILM, Non Intrusive Load Monitoring, HMM, Hidden Markov Models",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
title = "{A}ppliance {R}ecognition on {I}nternet-of-{T}hings {D}evices",
year = "2014",
}
• G. Bovet and J. Hennebert, "A Distributed Web-based Naming System for Smart Buildings," in Third IEEE workshop on the IoT: Smart Objects and Services, Sydney, Australia, 2014, pp. 1-6.
[Bibtex]
@conference{bovet:hal-01022861,
author = "G{\'e}r{\^o}me Bovet and Jean Hennebert",
abstract = "Nowadays, pervasive application scenarios relying on sensor networks are gaining momentum. The field of smart buildings is a promising playground where the use of sensors allows a reduction of the overall energy consumption. Most of current applications are using the classical DNS which is not suited for the Internet-of-Things because of requiring humans to get it working. From another perspective, Web technologies are pushing in sensor networks following the Web-of-Things paradigm advocating to use RESTful APIs for manipulating resources representing device capabilities. Being aware of these two observations, we propose to build on top of Web technologies leading to a novel naming system that is entirely autonomous. In this work, we describe the architecture supporting what can be called an autonomous Web-oriented naming system. As proof of concept, we simulate a rather large building and compare the behaviour of our approach to the legacy DNS and Multicast DNS (mDNS).",
booktitle = "Third IEEE workshop on the IoT: Smart Objects and Services",
doi = "10.1109/WoWMoM.2014.6918930",
isbn = "9781479947850",
keywords = "iot, wot, smart building, web of things, internet of things",
month = "Jun",
pages = "1-6",
title = "{A} {D}istributed {W}eb-based {N}aming {S}ystem for {S}mart {B}uildings",
year = "2014",
}
• S. Bromuri, D. Zufferey, J. Hennebert, and M. Schumacher, "Multi-Label Classification of Chronically Ill Patients with Bag of Words and Supervised Dimensionality Reduction Algorithms," Journal of Biomedical Informatics, vol. 54, pp. 165-175, 2014.
[Bibtex]
@article{brom:jbi:2014,
author = "Stefano Bromuri and Damien Zufferey and Jean Hennebert and Michael Schumacher",
abstract = "Objective.
This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates to study multi-label medical time series.
Methods.
We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers in two real world datasets. Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as hamming loss, one error, coverage, ranking loss, and average precision.
Results.
Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches.
Conclusions.
The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density.",
doi = "10.1016/j.jbi.2014.05.010",
issn = "15320464",
journal = "Journal of Biomedical Informatics",
keywords = "Machine Learning, Multi-label classification, Complex patient, Diabetes type 2, Clinical data, Dimensionality reduction, Kernel methods",
month = "2014/05/30",
pages = "165-175",
publisher = "Elsevier",
title = "{M}ulti-{L}abel {C}lassification of {C}hronically {I}ll {P}atients with {B}ag of {W}ords and {S}upervised {D}imensionality {R}eduction {A}lgorithms",
url = "http://www.j-biomed-inform.com/article/S1532-0464(14)00127-0/abstract",
volume = "54",
year = "2014",
}
• K. Chen, H. Wei, J. Hennebert, R. Ingold, and M. Liwicki, "Page Segmentation for Historical Handwritten Document Images Using Color and Texture Features," in International Conference on Frontiers in Handwriting Recognition (ICFHR), 2014, pp. 488-493.
[Bibtex]
@conference{chen2014:ICFHR,
Abstract = {In this paper we present a physical structure detection method for historical handwritten document images. We considered layout analysis as a pixel labeling problem. By classifying each pixel as either periphery, background, text block, or decoration, we achieve high quality segmentation without any assumption of specific topologies and shapes. Various color and texture features such as color variance, smoothness, Laplacian, Local Binary Patterns, and Gabor Dominant Orientation Histogram are used for classification. Some of these features have so far received little attention for document image layout analysis. By applying an Improved Fast Correlation-Based Filter feature selection algorithm, the redundant and irrelevant features are removed. Finally, the segmentation results are refined by a smoothing post-processing procedure. The proposed method is demonstrated by experiments conducted on three different historical handwritten document image datasets. Experiments show the benefit of combining various color and texture features for classification. The results also show the advantage of using a feature selection method to choose an optimal feature subset. By applying the proposed method we achieve superior accuracy compared with earlier work on several datasets, e.g., we achieved 93% accuracy compared with 91% of the previous method on the Parzival dataset which contains about 100 million pixels.},
Author = {Kai Chen and Hao Wei and Jean Hennebert and Rolf Ingold and Marcus Liwicki},
Booktitle = {International Conference on Frontiers in Handwriting Recognition (ICFHR)},
Doi = {10.1109/ICFHR.2014.88},
Isbn = {9781479978922},
Keywords = {machine learning, image analysis},
Pages = {488-493},
Publisher = {Institute of Electrical and Electronics Engineers ( IEEE )},
Title = {{P}age {S}egmentation for {H}istorical {H}andwritten {D}ocument {I}mages {U}sing {C}olor and {T}exture {F}eatures},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/ICFHR.2014.88}}
• K. Chen, H. Wei, M. Liwicki, J. Hennebert, and R. Ingold, "Robust Text Line Segmentation for Historical Manuscript Images Using Color and Texture," in 22nd International Conference on Pattern Recognition - ICPR, 2014, pp. 2978-2983.
[Bibtex]
@conference{chen2014:icpr,
Abstract = {In this paper we present a novel text line segmentation method for historical manuscript images. We use a pyramidal approach where at the first level, pixels are classified into: text, background, decoration, and out of page; at the second level, text regions are split into text line and non text line. Color and texture features based on Local Binary Patterns and Gabor Dominant Orientation are used for classification. By applying a modified Fast Correlation-Based Filter feature selection algorithm, redundant and irrelevant features are removed. Finally, the text line segmentation results are refined by a smoothing post-processing procedure. Unlike other projection profile or connected components methods, the proposed algorithm does not use any script-specific knowledge and is applicable to color images. The proposed algorithm is evaluated on three historical manuscript image datasets of diverse nature and achieved an average precision of 91% and recall of 84%. Experiments also show that the proposed algorithm is robust with respect to changes of the writing style, page layout, and noise on the image.},
Author = {Kai Chen and Hao Wei and Marcus Liwicki and Jean Hennebert and Rolf Ingold},
Booktitle = {22nd International Conference on Pattern Recognition - ICPR},
Doi = {10.1109/ICPR.2014.514},
Isbn = {9781479952106},
Keywords = {Machine Learning, Document Understanding, Segmentation, features and descriptors, Texture and color analysis},
Note = {Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.},
Pages = {2978-2983},
Publisher = {Institute of Electrical and Electronics Engineers ( IEEE )},
Title = {{R}obust {T}ext {L}ine {S}egmentation for {H}istorical {M}anuscript {I}mages {U}sing {C}olor and {T}exture},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/ICPR.2014.514}}
• K. Chen and J. Hennebert, "Content-Based Image Retrieval with LIRe and SURF on a Smart-phone-Based Product Image Database," in Pattern Recognition, J. Martinez-Trinidad, J. Carrasco-Ochoa, J. Olvera-Lopez, J. Salas-Rodriguez, and C. Suen, Eds., Springer International Publishing, 2014, pp. 231-240.
[Bibtex]
@incollection{chen2014:mcpr,
Abstract = {We present the evaluation of a product identification task using the LIRe system and SURF (Speeded-Up Robust Features) for content-based image retrieval (CBIR). The evaluation is performed on the Fribourg Product Image Database (FPID) that contains more than 3'000 pictures of consumer products taken using mobile phone cameras in realistic conditions. Using the evaluation protocol proposed with FPID, we explore the performance of different preprocessing and feature extraction. We observe that by using SURF, we can improve significantly the performance on this task. Image resizing and Lucene indexing are used in order to speed up CBIR task with SURF. We also show the benefit of using simple preprocessing of the images such as a proportional cropping of the images. The experiments demonstrate the effectiveness of the proposed method for the product identification task.},
Author = {Kai Chen and Jean Hennebert},
Booktitle = {Pattern Recognition},
Doi = {10.1007/978-3-319-07491-7_24},
Editor = {Martinez-Trinidad, Jos{\'e} Francisco and Carrasco-Ochoa, Jes{\'u}s Ariel and Olvera-Lopez, Jos{\'e} Arturo and Salas-Rodriguez, Joaquin and Suen, Ching Y.},
Isbn = {9783319074900},
Keywords = {cbir, image recognition, machine learning, product identification, smartphone-based image database, fpid, benchmarking},
Note = {Lecture Notes in Computer Science. 6th Mexican Conference on Pattern Recognition (MCPR2014)},
Pages = {231-240},
Publisher = {Springer International Publishing},
Title = {{C}ontent-{B}ased {I}mage {R}etrieval with {LIR}e and {SURF} on a {S}mart-phone-{B}ased {P}roduct {I}mage {D}atabase},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1007/978-3-319-07491-7_24}}
• C. Gisler, A. Ridi, M. Fauquey, D. Genoud, and J. Hennebert, "Towards Glaucoma Detection Using Intraocular Pressure Monitoring," in The 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR 2014), 2014, pp. 255-260.
[Bibtex]
@conference{gisler2014:socpar,
Abstract = {Diagnosing glaucoma is a very difficult task for healthcare professionals. High intraocular pressure (IOP) remains the main treatable symptom of this degenerative disease which leads to blindness. Nowadays, new types of wearable sensors, such as the contact lens sensor Triggerfish{\textregistered}, provide an automated recording of 24-hour profile of ocular dimensional changes related to IOP. Through several clinical studies, more and more IOP-related profiles have been recorded by those sensors and made available for elaborating data-driven experiments. The objective of such experiments is to analyse and detect IOP pattern differences between ill and healthy subjects. The potential is to provide medical doctors with analysis and detection tools allowing them to better diagnose and treat glaucoma. In this paper we present the methodologies, signal processing and machine learning algorithms elaborated in the task of automated detection of glaucomatous IOP-related profiles within a set of 100 24-hour recordings. As first convincing results, we obtained a classification ROC AUC of 81.5%.},
Author = {Christophe Gisler and Antonio Ridi and Mil{\`e}ne Fauquey and Dominique Genoud and Jean Hennebert},
Booktitle = {The 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR 2014)},
Doi = {10.1109/SOCPAR.2014.7008015},
Isbn = {9781479959358},
Keywords = {Biomedical signal processing, Glaucoma diagnosis, Machine learning},
Pages = {255-260},
Publisher = {Institute of Electrical and Electronics Engineers ( IEEE )},
Title = {{T}owards {G}laucoma {D}etection {U}sing {I}ntraocular {P}ressure {M}onitoring},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/SOCPAR.2014.7008015}}
• A. Ridi, C. Gisler, and J. Hennebert, "ACS-F2 - A new database of appliance consumption signatures," in Soft Computing and Pattern Recognition (SoCPaR), 2014 6th International Conference of, 2014, pp. 145-150.
[Bibtex]
@conference{ridi2014socpar,
Abstract = {We present ACS-F2, a new electric consumption signature database acquired from domestic appliances. The scenario of use is appliance identification with emerging applications such as domestic electricity consumption understanding, load shedding management and indirect human activity monitoring. The novelty of our work is to use low-end electricity consumption sensors typically located at the plug. Our approach consists in acquiring signatures at a low frequency, which contrasts with high frequency transient analysis approaches that are costlier and have been well studied in former research works. Electrical consumption signatures comprise real power, reactive power, RMS current, RMS voltage, frequency and phase of voltage relative to current. A total of 225 appliances were recorded over two sessions of one hour. The database is balanced with 15 different brands/models spread into 15 categories. Two realistic appliance recognition protocols are proposed and the database is made freely available to the scientific community for the experiment reproducibility. We also report on recognition results following these protocols and using baseline recognition algorithms like k-NN and GMM.},
Author = {A. Ridi and C. Gisler and J. Hennebert},
Booktitle = {Soft Computing and Pattern Recognition (SoCPaR), 2014 6th International Conference of},
Doi = {10.1109/SOCPAR.2014.7007996},
Isbn = {9781479959358},
Keywords = {machine learning, electric signal, appliance signatures},
Month = {Aug},
Pages = {145-150},
Publisher = {IEEE},
Title = {{ACS}-{F}2 - {A} new database of appliance consumption signatures},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/SOCPAR.2014.7007996}}
• A. Ridi, C. Gisler, and J. Hennebert, "Appliance and State Recognition using Hidden Markov Models," in The 2014 International Conference on Data Science and Advanced Analytics (DSAA 2014), Shanghai, China, 2014, pp. 270-276.
[Bibtex]
@conference{ridi2014dsaa,
Abstract = {We report on the analysis of electrical appliance consumption signatures for the identification task. We apply Hidden Markov Models to appliance signatures for the identification of their category and of the most probable sequence of states. The electrical signatures are measured at low frequency ($10^{-1}$ Hz) and are sourced from a specific database. We follow two predefined protocols for providing comparable results. Recovering information on the actual appliance state makes it possible to adopt energy saving measures, such as switching off stand-by appliances or, generally speaking, changing their state. Moreover, in most of the cases appliance states are related to user activities: the user interaction usually involves a transition of the appliance state. Information about the state transition could be useful in Smart Home / Building Systems to reduce energy consumption and increase human comfort. We report the results of the classification tasks in terms of confusion matrices and accuracy rates. Finally, we present our application for a real-time data visualization and the recognition of the appliance category with its actual state.},
Author = {Antonio Ridi and Christophe Gisler and Jean Hennebert},
Booktitle = {The 2014 International Conference on Data Science and Advanced Analytics (DSAA 2014)},
Doi = {10.1109/DSAA.2014.7058084},
Isbn = {9781479969821},
Keywords = {Appliance Identification, Appliance State Recognition, Intrusive Load Monitoring, ILM},
Month = {10/2014},
Note = {Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.},
Pages = {270-276},
Publisher = {Institute of Electrical and Electronics Engineers ( IEEE )},
Title = {{A}ppliance and {S}tate {R}ecognition using {H}idden {M}arkov {M}odels},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/DSAA.2014.7058084}}
• A. Ridi and J. Hennebert, "Hidden Markov Models for ILM Appliance Identification," in The 5th International Conference on Ambient Systems, Networks and Technologies (ANT-2014), the 4th International Conference on Sustainable Energy Information Technology (SEIT-2014), 2014, p. 1010–1015.
[Bibtex]
@conference{ridi2014:ant,
Abstract = {The automatic recognition of appliances through the monitoring of their electricity consumption finds many applications in smart buildings. In this paper we discuss the use of Hidden Markov Models (HMMs) for appliance recognition using so-called intrusive load monitoring (ILM) devices. Our motivation is found in the observation of electric signatures of appliances that usually show time varying profiles depending on the use made of the appliance or on the intrinsic internal operation of the appliance. To determine the benefit of such modelling, we propose a comparison of stateless modelling based on Gaussian mixture models and state-based models using Hidden Markov Models. The comparison is run on the publicly available database ACS-F1. We also compare different approaches to determine the best model topologies. More specifically we compare the use of a priori information on the device, a procedure based on a criterion of log-likelihood maximization and a heuristic approach.},
Author = {Antonio Ridi and Jean Hennebert},
Booktitle = {The 5th International Conference on Ambient Systems, Networks and Technologies (ANT-2014), the 4th International Conference on Sustainable Energy Information Technology (SEIT-2014)},
Doi = {10.1016/j.procs.2014.05.526},
Issn = {1877-0509},
Keywords = {Hidden Markov Models, appliance recognition, Intrusive Load Monitoring},
Note = {Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.},
Pages = {1010--1015},
Series = {Procedia Computer Science},
Title = {{H}idden {M}arkov {M}odels for {ILM} {A}ppliance {I}dentification},
Volume = {32},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1016/j.procs.2014.05.526}}
• A. Ridi, C. Gisler, and J. Hennebert, "A Survey on Intrusive Load Monitoring for Appliance Recognition," in 22nd International Conference on Pattern Recognition - ICPR, 2014, pp. 3702-3707.
[Bibtex]
@conference{ridi2014:icpr,
Abstract = {Electricity load monitoring of appliances has become an important task considering the recent economic and ecological trends. In this game, machine learning has an important part to play, allowing for energy consumption understanding, critical equipment monitoring and even human activity recognition. This paper provides a survey of current research on Intrusive Load Monitoring (ILM) techniques. ILM relies on low-end electricity meter devices spread inside the habitations, as opposed to Non-Intrusive Load Monitoring (NILM) that relies on a unique point of measurement, the smart meter. Potential applications and principles of ILMs are presented and compared to NILM. A focus is also given on feature extraction and machine learning algorithms typically used for ILM applications.},
Author = {Antonio Ridi and Christophe Gisler and Jean Hennebert},
Booktitle = {22nd International Conference on Pattern Recognition - ICPR},
Doi = {10.1109/ICPR.2014.636},
Isbn = {9781479952106},
Keywords = {Machine Learning, Intrusive Load Monitoring, ILM, IT for efficiency, Green Computing},
Month = {August},
Note = {Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.},
Organization = {IEEE},
Pages = {3702-3707},
Title = {{A} {S}urvey on {I}ntrusive {L}oad {M}onitoring for {A}ppliance {R}ecognition},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/ICPR.2014.636}}
• B. Wicht and J. Hennebert, "Camera-based Sudoku Recognition with Deep Belief Network," in 2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR 2014), 2014, pp. 83-88.
[Bibtex]
@conference{wicht2014:socpar,
Abstract = {In this paper, we propose a method to detect and recognize a Sudoku puzzle on images taken from a mobile camera. The lines of the grid are detected with a Hough transform. The grid is then recomposed from the lines. The digit positions are extracted from the grid and finally, each character is recognized using a Deep Belief Network (DBN). To test our implementation, we collected and made public a dataset of Sudoku images coming from cell phones. Our method proved successful on our dataset, achieving 87.5% of correct detection on the testing set. Only 0.37% of the cells were incorrectly guessed. The algorithm is capable of handling some alterations of the images, often present on phone-based images, such as distortion, perspective, shadows, illumination gradients or scaling. On average, our solution is able to produce a result from a Sudoku in less than 100ms.},
Author = {Baptiste Wicht and Jean Hennebert},
Booktitle = {2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR 2014)},
Doi = {10.1109/SOCPAR.2014.7007986},
Isbn = {9781479959358},
Keywords = {Machine Learning, DBN, Deep Belief Network, Image Recognition, Text Detection, Text Recognition},
Pages = {83-88},
Publisher = {Institute of Electrical and Electronics Engineers ( IEEE )},
Title = {{C}amera-based {S}udoku {R}ecognition with {D}eep {B}elief {N}etwork},
Year = {2014},
Bdsk-Url-2 = {http://dx.doi.org/10.1109/SOCPAR.2014.7007986}}
• O. Zayene, S. M. Touj, J. Hennebert, R. Ingold, and B. N. E. Amara, "Semi-automatic news video annotation framework for Arabic text," in 2014 4th International Conference on Image Processing Theory, Tools and Applications (IPTA), 2014, pp. 1-6.
[Bibtex]
@INPROCEEDINGS{zayene2014:ipta,
author={O. Zayene and S. M. Touj and J. Hennebert and R. Ingold and N. E. Ben Amara},
booktitle={2014 4th International Conference on Image Processing Theory, Tools and Applications (IPTA)},
title={Semi-automatic news video annotation framework for Arabic text},
year={2014},
pages={1-6},
abstract={In this paper, we present a semi-automatic news video annotation tool. The tool and its algorithms are dedicated to artificial Arabic text embedded in video news in the form of static text as well as scrolling text. Annotation is performed at two different levels. Taking into account the specificities of Arabic script, the tool manages a global level which concerns the entire video and a local level which concerns any specific frame extracted from the video. The global annotation is performed manually through a user interface. As a result of this step, we obtain the global XML file. The local annotation at the frame level is done automatically according to the information contained in the global metafile and a proposed text tracking algorithm. The main application of our tool is the ground truthing of textual information in video content. It is being used for this purpose in the Arabic Text in Video (AcTiV) database project in our lab. One of the functions that AcTiV provides is a benchmark to compare existing and future Arabic video OCR systems.},
keywords={XML;electronic publishing;natural language processing;text analysis;video signal processing;visual databases;AcTiV database;Arabic script;Arabic text in video database project;Arabic video OCR systems;artificial Arabic text;global XML file;global annotation;global level;global meta file;ground truthing;local level;scrolling text;semiautomatic news video annotation framework;static text;text tracking algorithm;textual information;user interface;Databases;Educational institutions;Heuristic algorithms;Optical character recognition software;Streaming media;User interfaces;XML;Benchmarking VideoOCR systems;annotation;artificial Arabic text;data sets},
doi={10.1109/IPTA.2014.7001963},
ISSN={2154-5111},
month={Oct},
pdf={http://www.hennebert.org/download/publications/ipta-2014-semi-automatic-news-video-annotation-framework-for-arabic-text.pdf},}
• G. Bovet and J. Hennebert, "Web-of-Things Gateways for KNX and EnOcean Networks," in International Conference on Cleantech for Smart Cities & Buildings from Nano to Urban Scale (CISBAT 2013), 2013, pp. 519-524.
[Bibtex]
@conference{bovet2013:cisbat,
author = "G{\'e}r{\^o}me Bovet and Jean Hennebert",
abstract = "Smart buildings tend to democratize both in new and renovated constructions aiming at minimizing energy consumption and maximizing comfort. They rely on dedicated networks of sensors and actuators orchestrated by management systems. Those systems tend to migrate from simple reactive control to complex predictive systems using self-learning algorithms requiring access to history data. The underlying building networks are often heterogeneous, leading to complex software systems having to implement all the available protocols and resulting in low system integration and heavy maintenance efforts. Typical building networks offer no common standardized application layer for building applications. This is not only true for data access but also for functionality discovery. They are based on technology-specific protocols that require expert knowledge when building software applications on top of them. The emerging Web-of-Things (WoT) framework, using well-known technologies like HTTP and RESTful APIs to offer a simple and homogeneous application layer, must be considered as a strong candidate for standardization purposes. In this work, we defend the position that the WoT framework is an excellent candidate to elaborate next generation BMS systems, mainly due to the simplicity and universality of the telecommunication and application protocols. Further to this, we investigate the possibility to implement a gateway allowing access to devices connected to KNX and EnOcean networks in a Web-of-Things manner. By taking advantage of the best practices of the WoT, we show the possibility of a fast integration of KNX in every control system. The elaboration of WoT gateways for EnOcean network presents further challenges that are described in the paper, essentially due to optimization of the underlying communication protocol.",
booktitle = "International Conference on Cleantech for Smart Cities {{\&}} Buildings from Nano to Urban Scale (CISBAT 2013)",
keywords = "IT for Sustainability, Smart Buildings, Web-of-Things, RESTful, KNX, EnOcean, Gateways",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "519-524",
title = "{W}eb-of-{T}hings {G}ateways for {KNX} and {E}n{O}cean {N}etworks",
year = "2013",
}
• G. Bovet and J. Hennebert, "Energy-Efficient Optimization Layer for Event-Based Communications on Wi-Fi Things," Procedia Computer Science, vol. 19, pp. 256-264, 2013.
[Bibtex]
@article{bovet:2013:ant:procedia,
author = "G{\'e}r{\^o}me Bovet and Jean Hennebert",
abstract = "The Web-of-Things or WoT offers a way to standardize the access to services embedded on everyday objects, leveraging on well accepted standards of the Web such as HTTP and REST services. The WoT offers new ways to build mashups of object services, notably in smart buildings composed of sensors and actuators. Many things are now taking advantage of the progress of embedded systems relying on the ubiquity of Wi-Fi networks following the 802.11 standards. Such things are often battery powered and the question of energy efficiency is therefore critical. In our research, we believe that several optimizations can be applied in the application layer to optimize the energy consumption of things. More specifically in this paper, we propose a hybrid layer automatically selecting the most appropriate communication protocol between current standards of WoT. Our results show that indeed not all protocols are equivalent in terms of energy consumption, and that some noticeable energy savings can be achieved by using our hybrid layer.",
doi = "10.1016/j.procs.2013.06.037",
issn = "1877-0509",
journal = "Procedia Computer Science",
keywords = "Web-of-Things, RESTful services, WebSockets, CoAP, Energy efficiency, Smart buildings",
note = "The 4th International Conference on Ambient Systems, Networks and Technologies (ANT 2013), the 3rd International Conference on Sustainable Energy Information Technology (SEIT-2013).
Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "256-264",
title = "{E}nergy-{E}fficient {O}ptimization {L}ayer for {E}vent-{B}ased {C}ommunications on {W}i-{F}i {T}hings ",
volume = "19",
year = "2013",
}
• G. Bovet and J. Hennebert, "An Energy Efficient Layer for Event-Based Communications in Web-of-Things Frameworks," in The 7th FTRA International Conference on Multimedia and Ubiquitous Engineering (MUE 2013), Springer - Lecture Notes in Electrical Engineering, 2013, vol. 240, pp. 93-101.
[Bibtex]
@inbook{bovet:2013:mue,
author = "G{\'e}r{\^o}me Bovet and Jean Hennebert",
abstract = "Leveraging on the Web-of-Things (WoT) allows standardizing the access of things from an application level point of view. The protocols of the Web and especially HTTP are offering new ways to build mashups of things consisting of sensors and actuators. Two communication protocols are now emerging in the WoT domain for event-based data exchange, namely WebSockets and RESTful APIs. In this work, we motivate and demonstrate the use of a hybrid layer able to choose dynamically the most energy efficient protocol.",
booktitle = "The 7th FTRA International Conference on Multimedia and Ubiquitous Engineering (MUE 2013)",
doi = "10.1007/978-94-007-6738-6_12",
isbn = "9789400767379",
keywords = "Web-of-Things, RESTful services, WebSockets",
month = "May",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "3",
pages = "93-101",
publisher = "Springer - Lecture Notes in Electrical Engineering",
series = "Multimedia and Ubiquitous Engineering",
title = "{A}n {E}nergy {E}fficient {L}ayer for {E}vent-{B}ased {C}ommunications in {W}eb-of-{T}hings {F}rameworks",
volume = "240",
year = "2013",
}
• K. Chen and J. Hennebert, "The Fribourg Product Image Database for Product Identification Tasks," in Proceedings of the 1st IEEE/IIAE International Conference on Intelligent Systems and Image Processing, 2013, pp. 162-169.
[Bibtex]
@conference{chen2013:icisip,
author = "Kai Chen and Jean Hennebert",
abstract = "We present in this paper a new database containing images of end-consumer products. The database currently contains more than 3'000 pictures of products taken exclusively using mobile phones. We focused the acquisition on 3 families of product: water bottles, chocolate and coffee. Nine mobile phones have been used and about 353 different products are available. Pictures are taken in real-life conditions, i.e. directly in the shops and without controlling the illumination, centering of the product or removing the background. Each image is provided with ground truth information including the product label, mobile phone brand and series as well as region of interest in the images. The database is made freely available for the scientific community and can be used for content-based image retrieval benchmark dataset or verification tasks.",
booktitle = "Proceedings of the 1st IEEE/IIAE International Conference on Intelligent Systems and Image Processing",
doi = "10.12792/icisip2013.033",
keywords = "CBIR, image retrieval, image database, FPID, benchmarking, product identification, machine learning",
month = "September",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "162-169",
title = "{T}he {F}ribourg {P}roduct {I}mage {D}atabase for {P}roduct {I}dentification {T}asks",
year = "2013",
}
• G. Bovet and J. Hennebert, "Offering Web-of-things Connectivity to Building Networks," in Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, New York, NY, USA, 2013, pp. 1555-1564.
[Bibtex]
@conference{bovet2013:wot,
author = "G{\'e}r{\^o}me Bovet and Jean Hennebert",
abstract = "Building management systems (BMS) are nowadays present in new and renovated buildings, relying on dedicated networks. The presence of various building networks leads to problems of heterogeneity, especially for developing BMS. In this paper, we propose to leverage on the Web-of-Things (WoT) framework, using well-known standard technologies of the Web like HTTP and RESTful APIs for standardizing the access to devices seen from an application point of view. We present the implementation of two gateways using the WoT approach for exposing KNX and EnOcean device capabilities as Web services, allowing a fast integration in existing and new management systems.",
address = "New York, NY, USA",
booktitle = "Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication",
doi = "10.1145/2494091.2497590",
isbn = "9781450322157",
keywords = "building networks, enocean, gateways, knx, web-of-things",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "1555-1564",
publisher = "ACM",
title = "{O}ffering {W}eb-of-things {C}onnectivity to {B}uilding {N}etworks",
year = "2013",
}
• C. Gisler, A. Ridi, and J. Hennebert, "Appliance Consumption Signature Database and Recognition Test Protocols," in WOSSPA2013 The 9th International Workshop on Systems, Signal Processing and their Applications 2013, 2013, pp. 336-341.
[Bibtex]
@conference{gisler:2013:wosspa,
author = "Christophe Gisler and Antonio Ridi and Jean Hennebert",
abstract = "We report on the creation of a database of appliance consumption signatures and two test protocols to be used for appliance recognition tasks. By means of plug-based low-end sensors measuring the electrical consumption at low frequency, typically every 10 seconds, we made two acquisition sessions of one hour on about 100 home appliances divided into 10 categories: mobile phones (via chargers), coffee machines, computer stations (including monitor), fridges and freezers, Hi-Fi systems (CD players), lamp (CFL), laptops (via chargers), microwave oven, printers, and televisions (LCD or LED). We measured their consumption in terms of real power (W), reactive power (var), RMS current (A) and phase of voltage relative to current ($\varphi$). We plan to give free access to the database for the whole scientific community. The proposed test protocols will help to objectively compare new algorithms.",
booktitle = "WOSSPA2013 The 9th International Workshop on Systems, Signal Processing and their Applications 2013",
doi = "10.1109/WoSSPA.2013.6602387",
isbn = "9781467355407",
keywords = "electric consumption modelling, benchmark protocols",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "336-341",
title = "{A}ppliance {C}onsumption {S}ignature {D}atabase and {R}ecognition {T}est {P}rotocols",
year = "2013",
}
• A. Ridi, N. Zarkadis, G. Bovet, N. Morel, and J. Hennebert, "Towards Reliable Stochastic Data-Driven Models Applied to the Energy Saving in Buildings," in International Conference on Cleantech for Smart Cities & Buildings from Nano to Urban Scale (CISBAT 2013), 2013, pp. 501-506.
[Bibtex]
@conference{ridi2013:cisbat,
author = "Antonio Ridi and Nikos Zarkadis and G{\'e}r{\^o}me Bovet and Nicolas Morel and Jean Hennebert",
abstract = "We aim at the elaboration of Information Systems able to optimize energy consumption in buildings while preserving human comfort. Our focus is in the use of state-based stochastic modeling applied to temporal signals acquired from heterogeneous sources such as distributed sensors, weather web services, calendar information and user triggered events. Our general scientific objectives are: (1) global instead of local optimization of building automation sub-systems (heating, ventilation, cooling, solar shadings, electric lightings), (2) generalization to unseen building configuration or usage through self-learning data-driven algorithms and (3) inclusion of stochastic state-based modeling to better cope with seasonal and building activity patterns. We leverage on state-based models such as Hidden Markov Models (HMMs) to be able to capture the spatial (states) and temporal (sequence of states) characteristics of the signals. We envision several application layers as per the intrinsic nature of the signals to be modeled. We also envision room-level systems able to leverage on a set of distributed sensors (temperature, presence, electricity consumption, etc.). A typical example of room-level system is to infer room occupancy information or activities done in the rooms as a function of time. Finally, building-level systems can be composed to infer global usage and to propose optimization strategies for the building as a whole. In our approach, each layer may be fed by the output of the previous layers.
More specifically in this paper, we report on the design, conception and validation of several machine learning applications. We present three different applications of state-based modeling. In the first case we report on the identification of consumer appliances through an analysis of their electric loads. In the second case we perform the activity recognition task, representing human activities through state-based models. The third case concerns the season prediction using building data, building characteristic parameters and meteorological data.",
booktitle = "International Conference on Cleantech for Smart Cities {{\&}} Buildings from Nano to Urban Scale (CISBAT 2013)",
keywords = "IT for Sustainability, Smart Buildings, Machine Learning",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "501-506",
title = "{T}owards {R}eliable {S}tochastic {D}ata-{D}riven {M}odels {A}pplied to the {E}nergy {S}aving in {B}uildings",
year = "2013",
}
• A. Ridi, C. Gisler, and J. Hennebert, "Unseen Appliances Identification," in Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2013, p. 75–82.
[Bibtex]
@conference{ridi2013:ciarp,
author = "Antonio Ridi and Christophe Gisler and Jean Hennebert",
abstract = "We assess the feasibility of unseen appliance recognition through the analysis of their electrical signatures recorded using low-cost smart plugs. By unseen, we stress that our approach focuses on the identification of appliances that are of different brands or models than the one in training phase. We follow a strictly defined protocol in order to provide comparable results to the scientific community. We first evaluate the drop of performance when going from seen to unseen appliances. We then analyze the results of different machine learning algorithms, as the k-Nearest Neighbor (k-NN) and Gaussian Mixture Models (GMMs). Several tunings allow us to achieve 74% correct accuracy using GMMs which is our current best system.",
booktitle = "Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications",
doi = "10.1007/978-3-642-41827-3_10",
editor = "Jos{\'e} Ruiz-Shulcloper and Gabriella Sanniti di Baja",
isbn = "9783642418266",
keywords = "machine learning, nilm, appliance identification, load monitoring",
pages = "75--82",
publisher = "Springer",
series = "Lecture Notes in Computer Science",
title = "{U}nseen {A}ppliances {I}dentification",
volume = "8259",
year = "2013",
}
• A. Ridi, C. Gisler, and J. Hennebert, "Le machine learning: un atout pour une meilleure efficacité - applications à la gestion énergétique des bâtiments," Bulletin Electrosuisse, iss. 10s, pp. 21-24, 2013.
[Bibtex]
@article{ridi2013:electrosuisse,
author = "Antonio Ridi and Christophe Gisler and Jean Hennebert",
abstract = "How can energy consumption and production in buildings be managed intelligently? The solutions to this complex problem could come from the world of machine learning. Machine learning enables the development of advanced control algorithms that simultaneously aim to reduce energy consumption, improve user comfort and adapt to the user's needs.",
issn = "1660-6728",
journal = "Bulletin Electrosuisse",
keywords = "Machine Learning, Energy Efficiency, Smart Buildings",
number = "10s",
pages = "21-24",
title = "{L}e machine learning: un atout pour une meilleure efficacit{\'e} - applications {\`a} la gestion {\'e}nerg{\'e}tique des b{\^a}timents",
year = "2013",
}
• A. Ridi, C. Gisler, and J. Hennebert, "Automatic Identification of Electrical Appliances Using Smart Plugs," in WOSSPA2013 The 9th International Workshop on Systems, Signal Processing and their Applications 2013, Algeria, 2013, pp. 301-305.
[Bibtex]
@conference{ridi:2013:wosspa,
author = "Antonio Ridi and Christophe Gisler and Jean Hennebert",
abstract = "We report on the evaluation of signal processing and classification algorithms to automatically recognize electric appliances. The system is based on low-cost smart-plugs measuring periodically the electricity values and producing time series of measurements that are specific to the appliance consumptions. In a similar way as for biometric applications, such electric signatures can be used to identify the type of appliance in use. In this paper, we propose to use dynamic features based on time derivative and time second derivative features and we compare different classification algorithms including K-Nearest Neighbor and Gaussian Mixture Models. We use the recently recorded electric signature database ACS-F1 and its intersession protocol to evaluate our algorithm propositions. The best combination of features and classifiers shows 93.6% accuracy.",
booktitle = "WOSSPA2013 The 9th International Workshop on Systems, Signal Processing and their Applications 2013",
doi = "10.1109/WoSSPA.2013.6602380",
isbn = "9781467355407",
keywords = "machine learning, electric consumption analysis, GMM, HMM",
month = "May",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "301-305",
publisher = "IEEE",
title = "{A}utomatic {I}dentification of {E}lectrical {A}ppliances {U}sing {S}mart {P}lugs",
year = "2013",
}
• F. Slimane, S. Kanoun, J. Hennebert, A. M. Alimi, and R. Ingold, "A Study on Font-Family and Font-Size Recognition Applied to Arabic Word Images at Ultra-Low Resolution," Pattern Recognition Letters (PRL), vol. 34, iss. 2, pp. 209-218, 2013.
[Bibtex]
@article{fouad2013:prl,
author = "Fouad Slimane and Slim Kanoun and Jean Hennebert and Adel M. Alimi and Rolf Ingold",
abstract = "In this paper, we propose a new font and size identification method for ultra-low resolution Arabic word images using a stochastic approach. The literature has proved the difficulty for Arabic text recognition systems to treat multi-font and multi-size word images. This is due to the variability induced by some font families, in addition to the inherent difficulties of Arabic writing including cursive representation, overlaps and ligatures. This research work proposes an efficient stochastic approach to tackle the problem of font and size recognition. Our method treats a word image with a fixed-length, overlapping sliding window. Each window is represented with 102 features whose distribution is captured by Gaussian Mixture Models (GMMs). We present three systems: (1) a font recognition system, (2) a size recognition system and (3) a font and size recognition system. We demonstrate the importance of font identification before recognizing the word images with two multi-font Arabic OCRs (cascading and global). The cascading system is about 23% better than the global multi-font system in terms of word recognition rate on the Arabic Printed Text Image (APTI) database which is freely available to the scientific community.",
doi = "10.1016/j.patrec.2012.09.012",
issn = "0167-8655",
journal = "Pattern Recognition Letters (PRL)",
keywords = "Font and size recognition, GMM, HMM, Arabic OCR, Sliding window, Ultra-low resolution, Machine Learning",
month = "January",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "2",
pages = "209-218",
title = "{A} {S}tudy on {F}ont-{F}amily and {F}ont-{S}ize {R}ecognition {A}pplied to {A}rabic {W}ord {I}mages at {U}ltra-{L}ow {R}esolution",
volume = "34",
year = "2013",
}
• F. Slimane, S. Kanoun, H. E. Abed, A. M. Alimi, R. Ingold, and J. Hennebert, "ICDAR2013 Competition on Multi-font and Multi-size Digitally Represented Arabic Text," in Document Analysis and Recognition (ICDAR), 2013 12th International Conference on, 2013, pp. 1433-1437.
[Bibtex]
@conference{slimane2013:icdar,
author = "Fouad Slimane and Slim Kanoun and Haikal El Abed and Adel M. Alimi and Rolf Ingold and Jean Hennebert",
abstract = "This paper describes the Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text held in the context of the 12th International Conference on Document Analysis and Recognition (ICDAR'2013), during August 25-28, 2013, Washington DC, United States of America. This competition has used the freely available Arabic Printed Text Image (APTI) database. A first edition took place in ICDAR'2011. In this edition, four groups with six systems are participating in the competition. The systems are compared using the recognition rates at character and word levels. The systems were tested in a blind manner using set 6 of APTI database. A short description of the participating groups, their systems, the experimental setup, and the observed results are presented.",
booktitle = "Document Analysis and Recognition (ICDAR), 2013 12th International Conference on",
doi = "10.1109/ICDAR.2013.289",
isbn = "9781479901937",
issn = "1520-5363",
keywords = "Character recognition, Databases, Feature extraction, Hidden Markov models, Image recognition, Protocols, Text recognition, APTI Database, Arabic Text, Competition, OCR System, Ultra-Low Resolution",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "1433-1437",
title = "{ICDAR}2013 {C}ompetition on {M}ulti-font and {M}ulti-size {D}igitally {R}epresented {A}rabic {T}ext",
year = "2013",
}
• N. Sokhn, R. Baltensperger, L. Bersier, J. Hennebert, and U. Ultes-Nitsche, "Identification of Chordless Cycles in Ecological Networks," in Complex Sciences, 2013, pp. 316-324.
[Bibtex]
@conference{sokhn2013:complex,
author = "Nayla Sokhn and Richard Baltensperger and Louis-F{\'e}lix Bersier and Jean Hennebert and Ulrich Ultes-Nitsche",
abstract = "In the last few years the studies on complex networks have gained extensive research interests. Significant impacts are made by these studies on a wide range of different areas including social networks, technology networks, biological networks and others. Motivated by understanding the structure of ecological networks we introduce in this paper a new algorithm for enumerating all chordless cycles. The proposed algorithm is a recursive one based on the depth-first search.
Keywords: ecological networks, community structure, food webs, niche-overlap graphs, chordless cycles.",
booktitle = "Complex Sciences",
doi = "10.1007/978-3-319-03473-7_28",
isbn = "978-3-319-03472-0",
keywords = "graph theory, ecological networks, food webs, algorithmics, complex systems",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "316-324",
publisher = "Springer International Publishing",
series = "Second International Conference, COMPLEX 2012, Santa Fe, NM, USA, December 5-7, 2012, Revised Selected Papers",
title = "{I}dentification of {C}hordless {C}ycles in {E}cological {N}etworks",
volume = "126",
year = "2013",
}
• N. Sokhn, R. Baltensperger, J. Hennebert, U. Ultes-Nitsche, and L. Bersier, "Structure analysis of niche-overlap graphs," in NetSci 2013 - International School and Conference on Network Science, 2013.
[Bibtex]
@conference{sokhn2013:netsci,
author = "Nayla Sokhn and Richard Baltensperger and Jean Hennebert and Ulrich Ultes-Nitsche and Louis-F{\'e}lix Bersier",
abstract = "The joint analysis of the structure and dynamics of complex networks has been recently a common interest for many researchers. In this study, we focus on the structure of ecological networks, specifically on niche-overlap graphs. In these networks, two species are connected if they share at least one prey, and thus represent competition graphs. The aim of this work is to reveal if these graphs show small-world/scale free properties. To answer this question, we select a set of 14 niche-overlap graphs from highly resolved food-webs, and study in the first part their clustering coefficient and diameter.",
booktitle = "NetSci 2013 - International School and Conference on Network Science",
keywords = "Graph structure analysis, Complex systems, Biology, Ecological Networks, Niche-overlap graphs",
title = "{S}tructure analysis of niche-overlap graphs",
year = "2013",
}
• N. Sokhn, R. Baltensperger, L. Bersier, U. Ultes-Nitsche, and J. Hennebert, "Structural Network Properties of Niche-Overlap Graphs," in International Conference on Signal-Image Technology & Internet-Based Systems (SITIS 2013), 2013, pp. 478-482.
[Bibtex]
@conference{sokhn2013:sitis,
author = "Nayla Sokhn and Richard Baltensperger and Louis-F{\'e}lix Bersier and Ulrich Ultes-Nitsche and Jean Hennebert",
abstract = "The structure of networks has always been interesting for researchers. Investigating their unique architecture allows to capture insights and to understand the function and evolution of these complex systems. Ecological networks such as food-webs and niche-overlap graphs are considered as complex systems. The main purpose of this work is to compare the topology of 15 real niche-overlap graphs with random ones. Five measures are treated in this study: (1) the clustering coefficient, (2) the betweenness centrality, (3) the assortativity coefficient, (4) the modularity and (5) the number of chordless cycles. Significant differences between real and random networks are observed. Firstly, we show that niche-overlap graphs display a higher clustering and a higher modularity compared to random networks. Moreover we find that random networks have barely nodes that belong to a unique subgraph (i.e. betweenness centrality equal to 0) and highlight the presence of a small number of chordless cycles compared to real networks. These analyses may provide new insights in the structure of these real niche-overlap graphs and may give important implications on the functional organization of species competing for some resources and on the dynamics of these systems.",
booktitle = "International Conference on Signal-Image Technology {\&} Internet-Based Systems (SITIS 2013)",
doi = "10.1109/SITIS.2013.83",
editor = "IEEE",
isbn = "9781479932115",
keywords = "Food-webs, Niche-Overlap Graphs, Structure of Networks, Clustering Coefficient, Betweenness Centrality, Assortativity, Modularity, Chordless Cycles",
pages = "478-482",
title = "{S}tructural {N}etwork {P}roperties of {N}iche-{O}verlap {G}raphs",
year = "2013",
}
• G. Bovet and J. Hennebert, "Le Web des objets à la conquête des bâtiments intelligents," Bulletin Electrosuisse, vol. 10s, pp. 15-18, 2012.
[Bibtex]
@article{gerome2012:electrosuisse,
author = "G{\'e}r{\^o}me Bovet and Jean Hennebert",
abstract = "L’am{\'e}lioration de l’efficacit{\'e} {\'e}nerg{\'e}tique des b{\^a}timents n{\'e}cessite des syst{\`e}mes automatiques de plus en plus sophistiqu{\'e}s pour optimiser le rapport entre les {\'e}conomies d’{\'e}nergie et le confort des usagers. La gestion conjointe du chauffage, de l’{\'e}clairage ou encore de la production locale d’{\'e}nergie est effectu{\'e}e via de v{\'e}ritables syst{\`e}mes d’information reposant sur une multitude de capteurs et d’actionneurs interconnect{\'e}s. Cette complexit{\'e} croissante exige une {\'e}volution des r{\'e}seaux de communication des b{\^a}timents.",
issn = "1660-6728",
journal = "Bulletin Electrosuisse",
keywords = "wot, iot, green-it, it-for-green",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "15-18",
title = "{L}e {W}eb des objets {\`a} la conqu{\^e}te des b{\^a}timents intelligents",
volume = "10s",
year = "2012",
}
• G. Bovet and J. Hennebert, "Communicating With Things - An Energy Consumption Analysis," in Pervasive, Newcastle, UK, 2012, pp. 1-4.
[Bibtex]
@conference{bove12:pervasive,
author = "G{\'e}r{\^o}me Bovet and Jean Hennebert",
abstract = "In this work we report on the analysis, from an energy consumption point of view, of two communication methods in the Web-of-Things (WoT) framework. The use of WoT is seducing regarding the standardization of the access to things. It also allows leveraging on existing web application frameworks and speed up development. However, in some contexts such as smart buildings where the objective is to control the equipments to save energy, the underlying WoT framework including hardware, communication and APIs must itself be energy efficient. More specifically, the WoT proposes to use HTTP callbacks or WebSockets based on TCP for exchanging data. In this paper we introduce both methods and then analyze their power consumption in a test environment. We also discuss what future research can be conducted from our preliminary findings.",
booktitle = "Pervasive",
keywords = "web-of-things; smart building; RESTful services; green-computing",
month = "June",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "1-4",
title = "{C}ommunicating {W}ith {T}hings - {A}n {E}nergy {C}onsumption {A}nalysis",
year = "2012",
}
• C. Gisler, G. Barchi, G. Bovet, E. Mugellini, and J. Hennebert, "Demonstration Of A Monitoring Lamp To Visualize The Energy Consumption In Houses," in The 10th International Conference on Pervasive Computing (Pervasive2012), Newcastle, 2012.
[Bibtex]
@conference{gisl12:pervasive,
author = "Christophe Gisler and Grazia Barchi and G{\'e}r{\^o}me Bovet and Elena Mugellini and Jean Hennebert",
abstract = "We report on the development of a wireless lamp dedicated to the feedback of energy consumption. The principle is to provide a simple and intuitive feedback to residents through color variations of the lamp depending on the amount of energy consumed in a house. Our system is demonstrated on the basis of inexpensive components piloted by a gateway storing and processing the energy data in a WoT framework. Different versions of the color choosing algorithm are also presented.",
booktitle = "The 10th International Conference on Pervasive Computing (Pervasive2012)",
keywords = "Web of Things; Energy feedback; Green Computing; IT-for-Green",
month = "jun",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
title = "{D}emonstration {O}f {A} {M}onitoring {L}amp {T}o {V}isualize {T}he {E}nergy {C}onsumption {I}n {H}ouses",
year = "2012",
}
• J. Hennebert, A. Schmoutz, S. Baudin, L. Zambon, and A. Delley, "Le projet ePark - Solutions technologiques pour la gestion des véhicules électriques et de leur charge," Electro Suisse Bulletin SEV/AES, vol. 4, pp. 34-36, 2012.
[Bibtex]
@article{henn12:electrosuisse,
author = "Jean Hennebert and Alain Schmoutz and S{\'e}bastien Baudin and Loic Zambon and Antoine Delley",
abstract = "Le projet ePark vise {\`a} amener sur le march{\'e} une solution technologique globale et ouverte pour la gestion des v{\'e}hicules {\'e}lectriques et de leur charge. Il comprend l’{\'e}laboration d’un mod{\`e}le de march{\'e}, ainsi que la r{\'e}alisation d’une borne de charge low-cost et d’un syst{\`e}me d’information {\'e}volutif. La solution inclura des services de gestion de flottes et de parkings, de planification de la charge, d’authentification des usagers et de facturation, qui seront accessibles via des interfaces Web ou des clients mobiles de type smartphone.",
issn = "1660-6728",
journal = "Electro Suisse Bulletin SEV/AES",
keywords = "Sustainable ICT, Green Computing, EV, IT for Efficiency",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "34-36",
title = "{L}e projet e{P}ark - {S}olutions technologiques pour la gestion des v{\'e}hicules {\'e}lectriques et de leur charge",
volume = "4",
year = "2012",
}
• F. Slimane, S. Kanoun, J. Hennebert, R. Ingold, and A. M. Alimi, "A New Baseline Estimation Method Applied to Arabic Word Recognition," in 10th IAPR International Workshop on Document Analysis Systems (DAS 2012), Gold Coast, Queensland, 2012.
[Bibtex]
@conference{fouad2012:das,
author = "Fouad Slimane and Slim Kanoun and Jean Hennebert and Rolf Ingold and Adel M. Alimi",
booktitle = "10th IAPR International Workshop on Document Analysis Systems (DAS 2012)",
month = "March",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
title = "{A} {N}ew {B}aseline {E}stimation {M}ethod {A}pplied to {A}rabic {W}ord {R}ecognition",
Pdf = "http://www.ict.griffith.edu.au/das2012/attachments/ShortPaperProceedings/S10.pdf",
year = "2012",
}
• F. Slimane, O. Zayene, S. Kanoun, A. M. Alimi, J. Hennebert, and R. Ingold, "New Features for Complex Arabic Fonts in Cascading Recognition System," in Proc. of 21st International Conference on Pattern Recognition (ICPR 2012), Tsukuba, Japan, 2012, pp. 738-741.
[Bibtex]
@conference{fouad2012:icpr,
author = "Fouad Slimane and Oussema Zayene and Slim Kanoun and Adel M. Alimi and Jean Hennebert and Rolf Ingold",
abstract = "We propose in this work an approach for automatic recognition of printed Arabic text in open vocabulary mode and ultra low resolution (72 dpi). This system is based on Hidden Markov Models using the HTK toolkit. The novelty of our work is in the analysis of three complex fonts presenting strong ligatures: DiwaniLetter, DecoTypeNaskh and DecoTypeThuluth. We propose a feature extraction based on statistical and structural primitives allowing a robust description of the different morphological variability of the considered fonts. The system is benchmarked on the Arabic Printed Text Image (APTI) database.",
booktitle = "Proc. of 21st International Conference on Pattern Recognition (ICPR 2012)",
isbn = "978-1-4673-2216-4",
issn = "1051-4651",
keywords = "Character and Text Recognition, Handwriting Recognition, Performance Evaluation, Machine Learning",
month = "November",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "738-741",
publisher = "IEEE",
title = "{N}ew {F}eatures for {C}omplex {A}rabic {F}onts in {C}ascading {R}ecognition {S}ystem",
year = "2012",
}
• F. Slimane, S. Kanoun, J. Hennebert, R. Ingold, and A. M. Alimi, "Benchmarking Strategy for Arabic Screen-Rendered Word Recognition," in Guide to OCR for Arabic Scripts, V. Märgner and H. El Abed, Eds., Springer London, 2012, pp. 423-450.
[Bibtex]
@inbook{slim12:guideocr,
author = "Fouad Slimane and Slim Kanoun and Jean Hennebert and Rolf Ingold and Adel M. Alimi",
abstract = "This chapter presents a new benchmarking strategy for Arabic screen-based word recognition. Firstly, we report on the creation of the new APTI (Arabic Printed Text Image) database. This database is a large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style word recognition systems in Arabic. Such systems take as input a text image and compute as output a character string corresponding to the text included in the image. The challenges that are addressed by the database are in the variability of the sizes, fonts and styles used to generate the images. A focus is also given on low resolution images where anti-aliasing is generating noise on the characters being recognized. The database contains 45,313,600 single word images totalling more than 250 million characters. Ground truth annotation is provided for each image from an XML file. The annotation includes the number of characters, the number of pieces of Arabic words (PAWs), the sequence of characters, the size, the style, the font used to generate each image, etc. Secondly, we describe the Arabic Recognition Competition: Multi-Font Multi-Size Digitally Represented Text held in the context of the 11th International Conference on Document Analysis and Recognition (ICDAR’2011), during September 18–21, 2011, Beijing, China. This first edition of the competition used the freely available APTI database. Two groups with three systems participated in the competition. The systems were compared using the recognition rates at the character and word levels. The systems were tested on one test dataset which is unknown to all participants (set 6 of APTI database). The systems were compared on the ground of the most important characteristic of classification systems: the recognition rate. A short description of the participating groups, their systems, the experimental setup and the observed results are presented.
Thirdly, we present our DIVA-REGIM system (out of competition at ICDAR’2011) with all results of the Arabic recognition competition protocols.",
booktitle = "Guide to OCR for Arabic Scripts",
doi = "10.1007/978-1-4471-4072-6_18",
editor = "M{\"a}rgner, Volker and El Abed, Haikal",
isbn = "978-1-4471-4071-9",
keywords = "arabic, ocr, recognition, database, benchmarking",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "423-450",
publisher = "Springer London",
series = "Guide to OCR for Arabic Scripts",
title = "{B}enchmarking {S}trategy for {A}rabic {S}creen-{R}endered {W}ord {R}ecognition",
year = "2012",
}
• D. Zufferey, C. Gisler, O. A. Khaled, and J. Hennebert, "Machine Learning Approaches for Electric Appliance Classification," in The 11th International Conference on Information Sciences, Signal Processing and their Applications: Main Tracks (ISSPA2012 - Tracks), Montreal, Canada, 2012, pp. 740-745.
[Bibtex]
@conference{zuff12:isspa,
author = "Damien Zufferey and Christophe Gisler and Omar Abou Khaled and Jean Hennebert",
abstract = "We report on the development of an innovative system which can automatically recognize home appliances based on their electric consumption profiles. The purpose of our system is to apply adequate rules to control electric appliance in order to save energy and money. The novelty of our approach is in the use of plug-based low-end sensors that measure the electric consumption at low frequency, typically every 10 seconds. Another novelty is the use of machine learning approaches to perform the classification of the appliances. In this paper, we present the system architecture, the data acquisition protocol and the evaluation framework. More details are also given on the feature extraction and classification models being used. The evaluation showed promising results with a correct rate of identification of 85\%.",
booktitle = "The 11th International Conference on Information Sciences, Signal Processing and their Applications: Main Tracks (ISSPA2012 - Tracks)",
doi = "10.1109/ISSPA.2012.6310651",
editor = "IEEE",
isbn = "978-1-4673-0381-1",
keywords = "Signal processing, machine learning algorithms, power system analysis computing, energy consumption, energy efficiency, sustainable development, green-computing, it-for-green",
month = "jul",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "740-745",
title = "{M}achine {L}earning {A}pproaches for {E}lectric {A}ppliance {C}lassification",
year = "2012",
}
• H. Gaddour, H. Guesmi, F. Slimane, S. Kanoun, and J. Hennebert, "A New Method for Ranking of Word Hypotheses generated from OCR: The Application on the Arabic Word Recognition," in Proceedings of The Twelfth IAPR Conference on Machine Vision Applications (MVA 2011), Nara (Japan), 2011.
[Bibtex]
@conference{gadd11:mva,
author = "Houda Gaddour and Han{\`e}ne Guesmi and Fouad Slimane and Slim Kanoun and Jean Hennebert",
abstract = "In this paper, we propose a new method for the best ranking of OCR word hypotheses in order to increase the chances that the correct hypothesis will be ranked in the first position. This method is based on the images construction of the OCR word hypotheses and the calculation of the dissimilarity scores between these last constructed images and the image to recognize. To evaluate the new proposed method, we compare them with a classic method which is based on the ranking of OCR word hypotheses under the recognition process. The experimental results of these two methods on the database of 1000 word images show that the new proposed method led to the best ranking of OCR word hypotheses.",
booktitle = "Proceedings of The Twelfth IAPR Conference on Machine Vision Applications (MVA 2011)",
keywords = "arabic, HMM, image processing, machine learning, OCR",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
title = "{A} {N}ew {M}ethod for {R}anking of {W}ord {H}ypotheses generated from {OCR}: {T}he {A}pplication on the {A}rabic {W}ord {R}ecognition",
year = "2011",
}
• J. Ortega-Garcia, J. Fierrez, F. Alonso-Fernandez, J. Galbally, M. R. Freire, J. Gonzalez-Rodriguez, C. Garcia-Mateo, J. Alba-Castro, E. Gonzalez-Agulla, E. Otero-Muras, S. Garcia-Salicetti, L. Allano, B. Ly-Van, B. Dorizzi, J. Kittler, T. Bourlai, N. Poh, F. Deravi, M. W. R. Ng, M. Fairhurst, J. Hennebert, A. Humm, M. Tistarelli, L. Brodo, J. Richiardi, A. Drygajlo, H. Ganster, F. M. Sukno, S. Pavani, A. Frangi, L. Akarun, and A. Savran, "The Multiscenario Multienvironment BioSecure Multimodal Database (BMDB)," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, pp. 1097-1111, 2010.
[Bibtex]
@article{ortega10:tpami,
author = "Javier Ortega-Garcia and Julian Fierrez and Fernando Alonso-Fernandez and Javier Galbally and Manuel R. Freire and Joaquin Gonzalez-Rodriguez and Carmen Garcia-Mateo and Jose-Luis Alba-Castro and Elisardo Gonzalez-Agulla and Enrique Otero-Muras and Sonia Garcia-Salicetti and Lorene Allano and Bao Ly-Van and Bernadette Dorizzi and Josef Kittler and Thirimachos Bourlai and Norman Poh and Farzin Deravi and Ming W. R. Ng and Michael Fairhurst and Jean Hennebert and Andreas Humm and Massimo Tistarelli and Linda Brodo and Jonas Richiardi and Andrzej Drygajlo and Harald Ganster and Federico M. Sukno and Sri-Kaushik Pavani and Alejandro Frangi and Lale Akarun and Arman Savran",
abstract = "A new multimodal biometric database designed and acquired within the framework of the European BioSecure Network of Excellence (NoE) is presented. It comprises more than 600 individuals acquired simultaneously in three scenarios: i) over the Internet, ii) in an office environment with desktop PC, and iii) in indoor/outdoor environments with mobile portable hardware. Data has been acquired over two acquisition sessions and using different sensors in certain modalities. The three scenarios include a common part of audio and video data (face still images and talking face videos). Also, signature and fingerprint data has been acquired both with desktop PC and mobile portable hardware. Additionally, hand and iris data was acquired in the second scenario using desktop PC. Acquisition has been conducted by 11 European institutions taking part in the BioSecure NoE. Additional features of the BioSecure Multimodal Database (BMDB) are: balanced gender and age distributions, multimodal realistic scenarios with simple and quick tasks per modality, cross-European diversity (language, face, etc.), availability of demographic data (age, gender, handedness, visual aids, manual worker and English proficiency) and compatibility with other multimodal databases. The novel acquisition conditions of the BMDB database allow to perform new challenging research and evaluation of either monomodal or multimodal biometric systems, as in the recent BioSecure Multimodal Evaluation Campaign. A description of this campaign including baseline results of individual modalities from the new database is also given. The database is expected to be available for research purposes through the BioSecure Association during 2008.",
doi = "10.1109/TPAMI.2009.76",
issn = "0162-8828",
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
keywords = "Benchmarking, Biometrics, machine learning",
note = "http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4815263
Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "1097-1111",
title = "{T}he {M}ultiscenario {M}ultienvironment {B}io{S}ecure {M}ultimodal {D}atabase ({BMDB})",
volume = "32",
year = "2010",
}
• F. Slimane, S. Kanoun, H. E. Abed, A. Alimi, R. Ingold, and J. Hennebert, "ICDAR 2011 - Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text," in Document Analysis and Recognition (ICDAR), 2011 International Conference on, 2011, pp. 1449-1453.
[Bibtex]
@conference{fouad10:icdar,
author = "Fouad Slimane and Slim Kanoun and Haikal El Abed and Adel Alimi and Rolf Ingold and Jean Hennebert",
abstract = "This paper describes the Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text held in the context of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), during September 18-21, 2011, Beijing, China. This first competition used the freely available Arabic Printed Text Image (APTI) database. Several research groups have started using the APTI database and this year, 2 groups with 3 systems are participating in the competition. The systems are compared using the recognition rates at the character and word levels. The systems were tested on one test dataset which is unknown to all participants (set 6 of APTI database). The systems are compared on the most important characteristic of classification systems, the recognition rate. A short description of the participating groups, their systems, the experimental setup, and the observed results are presented.",
booktitle = "Document Analysis and Recognition (ICDAR), 2011 International Conference on",
doi = "10.1109/ICDAR.2011.288",
isbn = "9781457713507",
issn = "1520-5363",
keywords = "APTI database;Arabic printed text image database;Arabic recognition competition;ICDAR2011;character recognition;classification system;document analysis;document recognition;multifont multisize digitally text representation;character recognition;document i",
month = "sept.",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "1449-1453",
title = "{ICDAR} 2011 - {A}rabic {R}ecognition {C}ompetition: {M}ulti-font {M}ulti-size {D}igitally {R}epresented {T}ext",
year = "2011",
}
• F. Slimane, R. Ingold, S. Kanoun, A. Alimi, and J. Hennebert, "Impact of Character Models Choice on Arabic Text Recognition Performance," in 12th International Conference on Frontiers in Handwriting Recognition, ICFHR 2010, 2010, pp. 670-675.
[Bibtex]
@conference{fouad10:icfhr,
author = "Fouad Slimane and Rolf Ingold and Slim Kanoun and Adel Alimi and Jean Hennebert",
abstract = "We analyze in this paper the impact of sub-models choice for automatic Arabic printed text recognition based on Hidden Markov Models (HMM). In our approach, sub-models correspond to characters shapes assembled to compose words models. One of the peculiarities of Arabic writing is to present various character shapes according to their position in the word. With 28 basic characters, there are over 120 different shapes. Ideally, there should be one sub-model for each different shape. However, some shapes are less frequent than others and, as training databases are finite, the learning process leads to less reliable models for the infrequent shapes. We show in this paper that an optimal set of models has then to be found looking for the trade-off between having more models capturing the intricacies of shapes and grouping the models of similar shapes with other. We propose in this paper different sets of sub-models that have been evaluated using the Arabic Printed Text Image (APTI) Database freely available for the scientific community.",
booktitle = "12th International Conference on Frontiers in Handwriting Recognition, ICFHR 2010",
doi = "10.1109/ICFHR.2010.110",
isbn = "9781424483532",
keywords = "arabic, HMM, machine learning, OCR",
month = "nov",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
pages = "670-675",
title = "{I}mpact of {C}haracter {M}odels {C}hoice on {A}rabic {T}ext {R}ecognition {P}erformance",
year = "2010",
}
• J. Hennebert, J. Rey, and Y. Bocchi, Les méthodes agiles en action, 2010.
[Bibtex]
@misc{rey2010:nouvelliste,
author = "Jean Hennebert and Jean-Pierre Rey and Yann Bocchi",
abstract = "L’{\'e}volution rapide de notre monde met en lumi{\`e}re les limites d’une gestion de projet traditionnelle. La solution de la HES-SO. Les m{\'e}thodes classiques de gestion de projet comprennent g{\'e}n{\'e}ralement les phases de sp{\'e}cification compl{\`e}te du produit {\`a} d{\'e}velopper, d’estimation et de planification, de mod{\'e}lisation, de r{\'e}alisation, de tests, de d{\'e}ploiement et de maintenance. Elles sont de moins en moins bien adapt{\'e}es aux nombreux changements qui ne manquent pas de se produire en cours de projet: facteurs externes (concurrence, nouvelles technologies {\'e}mergentes, nouveaux besoins, etc.) ou internes (changement d’organisation, ressources cl{\'e}s qui quittent l’entreprise, etc.), complexit{\'e} sous-{\'e}valu{\'e}e, attentes et besoins des clients qui {\'e}voluent au cours du temps, etc.
Ces changements impliquent une m{\'e}thode de travail plus souple et plus r{\'e}active afin qu’un projet se termine {\`a} satisfaction pour toutes les parties engag{\'e}es. Le principe fondamental de l’agilit{\'e} est en effet de consid{\'e}rer le changement non plus comme une source de probl{\`e}mes mais comme un param{\`e}tre inh{\'e}rent {\`a} tous les projets. Le changement devient ainsi partie prenante du projet, qui s’organise et se rythme en fonction de celui-ci.",
howpublished = "Le Nouvelliste",
keywords = "Agile, IT project management, SCRUM",
month = "oct#{27th}",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
title = "{L}es m{\'e}thodes agiles en action",
year = "2010",
}
• G. Rudaz, J. Hennebert, and H. Müller, "A 3D Object Retrieval System Using Text and Simple Visual Information," HES-SO, Technical Report, 2010.
[Bibtex]
@techreport{rudaz2010:3dobject,
author = "Gilles Rudaz and Jean Hennebert and Henning M\"uller",
abstract = "3D objects are being increasingly produced and used by a broad public. Tools are thus required to manage collections of 3D objects in a similar way to the management of image collections including the possibility to search using text queries. The system we are presenting in this paper goes one step further allowing to search for 3D objects not only using text queries but also using very simple similarity metrics inferred from the 3D spatial description. The multimodal search is stepwise. First, the user searches for a set of relevant objects using a classical text–based query engine. Second, the user selects a subset of the returned objects to perform a new query in the database according to a relevance feedback method. The relevance is computed using different geometric criteria that can be activated or deactivated. External objects can also be submitted directly for a similarity query. The current visual features are very simple but are planned to be extended in the future.",
institution = "HES-SO",
keywords = "3D information retrieval, visual information retrieval",
month = "jul",
number = " ",
title = "{A} 3{D} {O}bject {R}etrieval {S}ystem {U}sing {T}ext and {S}imple {V}isual {I}nformation",
type = "Technical Report",
year = "2010",
}
• F. Slimane, S. Kanoun, A. Alimi, J. Hennebert, and R. Ingold, "Comparison of Global and Cascading Recognition Systems Applied to Multi-font Arabic Text," in 10th ACM Symposium on Document Engineering, DocEng'10, 2010, pp. 161-164.
[Bibtex]
@conference{fouad10:doceng,
author = "Fouad Slimane and Slim Kanoun and Adel Alimi and Jean Hennebert and Rolf Ingold",
abstract = "A known difficulty of Arabic text recognition is in the large variability of printed representation from one font to the other. In this paper, we present a comparative study between two strategies for the recognition of multi-font Arabic text. The first strategy is to use a global recognition system working independently on all the fonts. The second strategy is to use a so-called cascade built from a font identification system followed by font-dependent systems. In order to reach a fair comparison, the feature extraction and the modeling algorithms based on HMMs are kept as similar as possible between both approaches. The evaluation is carried out on the large and publicly available APTI (Arabic Printed Text Image) database with 10 different fonts. The results show a clear performance advantage for the cascading approach. However, the cascading system is more costly in terms of CPU and memory.",
booktitle = "10th ACM Symposium on Document Engineering, DocEng'10",
crossref = " ",
doi = "10.1145/1860559.1860591",
editor = " ",
isbn = "9781450302319",
keywords = "arabic, HMM, image processing, machine learning, OCR",
month = "sep",
number = " ",
organization = " ",
pages = "161-164",
publisher = " ",
series = "DocEng '10",
title = "{C}omparison of {G}lobal and {C}ascading {R}ecognition {S}ystems {A}pplied to {M}ulti-font {A}rabic {T}ext",
volume = " ",
year = "2010",
}
• F. Slimane, S. Kanoun, A. Alimi, R. Ingold, and J. Hennebert, "Gaussian Mixture Models for Arabic Font Recognition," in 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul (Turkey), 2010, pp. 2174-2177.
[Bibtex]
@conference{fouad10:icpr,
author = "Fouad Slimane and Slim Kanoun and Adel Alimi and Rolf Ingold and Jean Hennebert",
abstract = "We present in this paper a new approach for Arabic font recognition. Our proposal is to use a fixed-length sliding window for the feature extraction and to model feature distributions with Gaussian Mixture Models (GMMs). This approach presents a double advantage. First, we do not need to perform a priori segmentation into characters, which is a difficult task for Arabic text. Second, we use versatile and powerful GMMs able to model finely distributions of features in large multi-dimensional input spaces. We report on the evaluation of our system on the APTI (Arabic Printed Text Image) database using 10 different fonts and 10 font sizes. Considering the variability of the different font shapes and the fact that our system is independent of the font size, the obtained results are convincing and compare well with competing systems.",
booktitle = "20th International Conference on Pattern Recognition, ICPR 2010",
crossref = " ",
doi = "10.1109/ICPR.2010.532",
editor = " ",
isbn = "9781424475421",
keywords = "arabic, GMM, machine learning, OCR",
month = "aug",
number = " ",
organization = " ",
pages = "2174-2177",
publisher = " ",
series = " ",
title = "{G}aussian {M}ixture {M}odels for {A}rabic {F}ont {R}ecognition",
volume = " ",
year = "2010",
}
• F. Verdet, D. Matrouf, J. Bonastre, and J. Hennebert, "Channel Detectors for System Fusion in the Context of NIST LRE 2009," in 11th Annual Conference of the International Speech Communication Association, Interspeech 2010, 2010, pp. 733-736.
[Bibtex]
@conference{verdet10:interspeech,
author = "Florian Verdet and Driss Matrouf and Jean-Fran{\c{c}}ois Bonastre and Jean Hennebert",
abstract = "One of the difficulties in Language Recognition is the variability of the speech signal due to speakers and channels. If channel mismatch is too big and when different categories of channels can be identified, one possibility is to build a separate language recognition system for each category and then to fuse them together. This article uses a system selector that takes, for each utterance, the scores of one of the channel-category dependent systems. This selection is guided by a channel detector. We analyze different ways to design such channel detectors: based on cepstral features or on the Factor Analysis channel variability term. The systems are evaluated in the context of NIST’s LRE 2009 and run at 1.65\% minCavg for a subset of 8 languages and at 3.85\% minCavg for the 23 language setup.",
booktitle = "11th Annual Conference of the International Speech Communication Association, Interspeech 2010",
crossref = " ",
editor = " ",
keywords = "channel, channel category, channel detector, factor analysis, fusion, Language Identification, machine learning",
month = "sep",
number = " ",
organization = " ",
pages = "733-736",
publisher = " ",
series = " ",
title = "{C}hannel {D}etectors for {S}ystem {F}usion in the {C}ontext of {NIST} {LRE} 2009",
volume = " ",
year = "2010",
}
• F. Verdet, D. Matrouf, J. Bonastre, and J. Hennebert, "Coping with Two Different Transmission Channels in Language Recognition," in Odyssey 2010, The Speaker and Language Recognition Workshop, 2010, pp. 230-237.
[Bibtex]
@conference{verdet10:odyssey,
author = "Florian Verdet and Driss Matrouf and Jean-Fran{\c{c}}ois Bonastre and Jean Hennebert",
abstract = "This paper confirms the huge benefits of Factor Analysis over Maximum A-Posteriori adaptation for language recognition (up to 87\% relative gain). We investigate ways to cope with the particularity of NIST’s LRE 2009, containing Conversational Telephone Speech (CTS) and phone bandwidth segments of radio broadcasts (Voice Of America, VOA). We analyze GMM systems using all data pooled together, eigensession matrices estimated on a per condition basis and systems using a concatenation of these matrices. Results are presented on all LRE 2009 test segments, as well as only on the CTS or only on the VOA test utterances. Since performances on all 23 languages are not trivial to compare, due to lacking language–channel combinations in the training and also in the testing data, all systems are also evaluated in the context of the subset of 8 common languages. Addressing the question if a fusion of two channel specific systems may be more beneficial than putting all data together, we study an oracle based system selector. On the 8 language subset, a pure CTS system performs at a minimal average cost of 2.7\% and pure VOA at 1.9\% minCavg on their respective test conditions. The fusion of these two systems runs at 2.0\% minCavg. As main observation, we see that the way we estimate the session compensation matrix has not a big influence, as long as the language–channel combinations cover those used for training the language models. Far more crucial is the kind of data used for model estimation.",
booktitle = "Odyssey 2010, The Speaker and Language Recognition Workshop",
crossref = " ",
editor = " ",
keywords = "Benchmarking, Biometrics, Speaker Verification",
month = "jun",
number = " ",
organization = " ",
pages = "230-237",
publisher = " ",
series = " ",
title = "{C}oping with {T}wo {D}ifferent {T}ransmission {C}hannels in {L}anguage {R}ecognition",
volume = " ",
year = "2010",
}
• M. E. Betjali, J. Bloechle, A. Humm, R. Ingold, and J. Hennebert, "Labeled Images Verification Using Gaussian Mixture Models," in 24th Annual ACM Symposium on Applied Computing (ACM SAC 09), Honolulu, USA, March 8 - 12, 2009, p. 1331–1336.
[Bibtex]
@conference{betj09:acmsac,
author = "Micheal El Betjali and Jean-Luc Bloechle and Andreas Humm and Rolf Ingold and Jean Hennebert",
abstract = "We are proposing in this paper an automated system to verify that images are correctly associated to labels. The novelty of the system is in the use of Gaussian Mixture Models (GMMs) as statistical modeling scheme as well as in several improvements introduced specifically for the verification task. Our approach is evaluated using the Caltech 101 database. Starting from an initial baseline system providing an equal error rate of 27.4\%, we show that the rate of errors can be reduced down to 13\% by introducing several optimizations of the system. The advantage of the approach lies in the fact that basically any object can be generically and blindly modeled with limited supervision. A potential target application could be a post-filtering of images returned by search engines to prune out or reorder less relevant images.",
booktitle = "24th Annual ACM Symposium on Applied Computing (ACM SAC 09), Honolulu, USA, March 8 - 12",
crossref = " ",
doi = "10.1145/1529282.1529581",
editor = " ",
isbn = "9781605581668",
keywords = "image recognition, gmm",
month = "March",
number = " ",
organization = " ",
pages = "1331--1336",
publisher = " ",
series = " ",
title = "{L}abeled {I}mages {V}erification {U}sing {G}aussian {M}ixture {M}odels",
volume = " ",
year = "2009",
}
• J. Hennebert, Vers La Business Intelligence Environnementale, 2009.
[Bibtex]
@misc{henn09:ibcom,
author = "Jean Hennebert",
abstract = "Companies generally do not know their direct and indirect impact on the environment. Such information is becoming strategic today for three reasons. First, companies are looking for solutions to identify priorities for action. Second, they want to prevent risks, for example by anticipating the decisions of legislation that is currently being put in place. Finally, they want to be able to communicate the efficiency of their actions and compare themselves with the competition.",
howpublished = "IBCOM market.ch, p. 7",
month = "sep",
title = "{V}ers {L}a {B}usiness {I}ntelligence {E}nvironnementale",
year = "2009",
}
• A. Humm, R. Ingold, and J. Hennebert, "Spoken Handwriting for User Authentication using Joint Modelling Systems," in Proceedings of the 6th International Symposium on Image and Signal Processing and Analysis (ISPA 09), Salzburg, Austria, September 16 - 18, 2009, pp. 505-510.
[Bibtex]
@conference{humm09:ispa,
author = "Andreas Humm and Rolf Ingold and Jean Hennebert",
abstract = "We report on results obtained with a new user authentication system based on a combined acquisition of online pen and speech signals. In our approach, the two modalities are recorded by simply asking the user to say what she or he is simultaneously writing. The main benefit of this methodology lies in the simultaneous acquisition of two sources of biometric information with a better accuracy at no extra cost in terms of time or inconvenience. Another benefit comes from an increased difficulty for forgers willing to perform imitation attacks as two signals need to be reproduced. Our first strategy was to model independently both streams of data and to perform a fusion at the score level using state-of-the-art modelling tools and training algorithms. We report here on a second strategy, complementing the first one and aiming at modelling both streams of data jointly. This approach uses a recognition system to compute the forced alignment of Hidden Markov Models (HMMs). The system then tries to determine synchronization patterns using these two alignments of handwriting and speech and computes a new score according to these patterns. In this paper, we present these authentication systems with the focus on the joint modelling. The evaluation is performed on MyIDea, a realistic multimodal biometric database. Results show that a combination of the different modelling strategies (independent and joint) can improve the system performance on spoken handwriting data.",
booktitle = "Proceedings of the 6th International Symposium on Image and Signal Processing and Analysis (ISPA 09), Salzburg, Austria, September 16 - 18",
crossref = " ",
editor = " ",
isbn = "9789531841351",
issn = "1845-5921",
keywords = "biometrics, speech, writer, fusion",
month = " ",
number = " ",
organization = " ",
pages = "505-510",
publisher = " ",
series = " ",
title = "{S}poken {H}andwriting for {U}ser {A}uthentication using {J}oint {M}odelling {S}ystems",
volume = " ",
year = "2009",
}
• A. Humm, J. Hennebert, and R. Ingold, "Combined Handwriting And Speech Modalities For User Authentication, TSMCA," IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, vol. 39, iss. 1, p. 25–35, 2009.
[Bibtex]
@article{humm09:TSMCA,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "In this paper we report on the development of an efficient user authentication system based on a combined acquisition of online pen and speech signals. The novelty of our approach is in the simultaneous recording of these two modalities, simply asking the user to utter what she/he is writing. The main benefit of this multimodal approach is a better accuracy at no extra costs in terms of access time or inconvenience. Another benefit comes from an increased difficulty for forgers willing to perform imitation attacks as two signals need to be reproduced. We are comparing here two potential scenarios of use. The first one is called spoken signatures where the user signs and says the content of the signature. The second scenario is based on spoken handwriting where the user is prompted to write and read the content of sentences randomly extracted from a text. Data according to these two scenarios have been recorded from a set of 70 users. In the first part of the paper, we describe the acquisition procedure and we comment on the viability and usability of such simultaneous recordings. Our conclusions are supported by a short survey performed with the users. In the second part, we present the authentication systems that we have developed for both scenarios. More specifically, our strategy was to model independently both streams of data and to perform a fusion at the score level. Starting from a state-of-the-art modelling algorithm based on Gaussian Mixture Models (GMMs) trained with an Expectation Maximization (EM) procedure, we report on several significant improvements that are brought. As a general observation, the use of both modalities outperforms significantly the modalities used alone.",
crossref = " ",
doi = "10.1109/TSMCA.2008.2007978",
issn = "1083-4427",
journal = "IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans",
keywords = "biometrics, handwriting, speech",
month = "January",
number = "1",
pages = "25--35",
title = "{C}ombined {H}andwriting {A}nd {S}peech {M}odalities {F}or {U}ser {A}uthentication, {TSMCA}",
volume = "39",
year = "2009",
}
• S. Kanoun, F. Slimane, H. Guesmi, R. Ingold, A. Alimi, and J. Hennebert, "Affixal Approach versus Analytical Approach for Off-Line Arabic Decomposable Vocabulary Recognition," in International Conference on Document Analysis and Recognition (ICDAR 09), July 26 - 29, Barcelona, Spain, 2009, p. 661–665.
[Bibtex]
@conference{kano09:icdar,
author = "Slim Kanoun and Fouad Slimane and Han{\^e}ne Guesmi and Rolf Ingold and Adel Alimi and Jean Hennebert",
abstract = "In this paper, we propose a comparative study between the affixal approach and the analytical approach for off-line Arabic decomposable word recognition. The analytical approach is based on the modeling of alphabetical letters. The affixal approach is based on the modeling of the linguistic entities, namely prefix, infix, suffix and root. The experimental results obtained by these two approaches are presented on the basis of a mono-font printed decomposable word data set with varying character sizes. We conclude the paper with the current improvements of our work concerning Arabic multi-font, multi-style and multi-size word recognition.",
booktitle = "International Conference on Document Analysis and Recognition (ICDAR 09), July 26 - 29, Barcelona, Spain",
crossref = " ",
doi = "10.1109/ICDAR.2009.264",
editor = " ",
isbn = "9781424445004",
issn = "1520-5363",
keywords = "Character recognition, Image analysis, Informatics, Information analysis, Information systems, Machine intelligence, Neural networks, Shape, Text analysis, Vocabulary",
month = "July",
number = " ",
organization = " ",
pages = "661--665",
publisher = " ",
series = " ",
title = "{A}ffixal {A}pproach versus {A}nalytical {A}pproach for {O}ff-{L}ine {A}rabic {D}ecomposable {V}ocabulary {R}ecognition",
url = "http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5277473",
volume = " ",
year = "2009",
}
• A. Mayoue, B. Dorizzi, L. Allano, G. Chollet, J. Hennebert, D. Petrovska, and F. Verdet, "BioSecure Multimodal Evaluation Campaign 2007 (BMEC 2007)," in Guide to Biometric Reference Systems and Performance Evaluation, D. Petrovska, G. Chollet, and B. Dorizzi, Eds., Springer, 2009, pp. 327-371.
[Bibtex]
@inbook{mayo09:bmec,
author = "Aur{\'e}lien Mayoue and Bernadette Dorizzi and Lorene Allano and G{\'e}rard Chollet and Jean Hennebert and Dijana Petrovska and Florian Verdet",
abstract = "Chapter about a large-scale Multimodal Evaluation Campaign held in 2007 in the framework of the European BioSecure project. The book title is ``Guide to Biometric Reference Systems and Performance Evaluation''.",
booktitle = "Guide to Biometric Reference Systems and Performance Evaluation",
chapter = "11",
crossref = " ",
doi = "10.1007/978-1-84800-292-0_11",
editor = "Petrovska, Dijana and Chollet, G{\'e}rard and Dorizzi, Bernadette",
isbn = "9781848002913",
key = " ",
keywords = "Benchmarking, Biometrics",
month = " ",
organization = " ",
pages = "327-371",
publisher = "Springer",
series = "Guide to Biometric Reference Systems and Performance Evaluation",
title = "{B}io{S}ecure {M}ultimodal {E}valuation {C}ampaign 2007 ({BMEC} 2007)",
url = "http://rd.springer.com/chapter/10.1007/978-1-84800-292-0_11",
year = "2009",
}
• F. Slimane, R. Ingold, S. Kanoun, A. Alimi, and J. Hennebert, "A New Arabic Printed Text Image Database and Evaluation Protocols," in International Conference on Document Analysis and Recognition (ICDAR 09), July 26 - 29, Barcelona, Spain, 2009, p. 946–950.
[Bibtex]
@conference{slim09:icdar,
author = "Fouad Slimane and Rolf Ingold and Slim Kanoun and Adel Alimi and Jean Hennebert",
abstract = "We report on the creation of a database composed of images of Arabic Printed words. The purpose of this database is the large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style text recognition systems in Arabic. The challenges that are addressed by the database are in the variability of the sizes, fonts and style used to generate the images. A focus is also given on low-resolution images where anti-aliasing is generating noise on the characters to recognize. The database is synthetically generated using a lexicon of 113’284 words, 10 Arabic fonts, 10 font sizes and 4 font styles. The database contains 45’313’600 single word images totaling to more than 250 million characters. Ground truth annotation is provided for each image. The database is called APTI for Arabic Printed Text Images.",
booktitle = "International Conference on Document Analysis and Recognition (ICDAR 09), July 26 - 29, Barcelona, Spain",
crossref = " ",
doi = "10.1109/ICDAR.2009.155",
editor = " ",
isbn = "9781424445004",
issn = "1520-5363",
keywords = "arabic, machine learning, OCR",
month = " ",
number = " ",
organization = " ",
pages = "946--950",
publisher = " ",
series = " ",
title = "{A} {N}ew {A}rabic {P}rinted {T}ext {I}mage {D}atabase and {E}valuation {P}rotocols",
volume = " ",
year = "2009",
}
• F. Slimane, S. Kanoun, J. Hennebert, A. M. Alimi, and R. Ingold, "Modèles de Markov Cachés et Modèle de Longueur pour la Reconnaissance de l'Ecriture Arabe à Basse Résolution," in Proceedings of MAnifestation des JEunes Chercheurs en Sciences et Technologies de l'Information et de la Communication (MajecSTIC 2009), Avignon (France), 2009.
[Bibtex]
@conference{slim09:majestic,
author = "Fouad Slimane and Slim Kanoun and Jean Hennebert and Adel M. Alimi and Rolf Ingold",
abstract = "We present in this paper an open-vocabulary, low-resolution automatic Arabic text recognition system based on Hidden Markov Models. Such models perform very well when it comes to solving the dual problem of segmentation and recognition for signals corresponding to sequences of different states, for example in speech or cursive handwriting recognition. The specificity of our approach lies in the introduction of length models for the recognition of printed Arabic. These are inferred automatically during the training phase, and their implementation is achieved by a simple alteration of the models of each character composing the words. In our approach, each word is represented by a sequence of sub-models, which are in turn represented by states whose number is proportional to the length of each character. This improvement allowed us to significantly increase recognition performance and to develop an open-vocabulary recognition system. The system was evaluated using the HTK toolkit on a database of low-resolution synthetic images.",
booktitle = "Proceedings of MAnifestation des JEunes Chercheurs en Sciences et Technologies de l'Information et de la Communication (MajecSTIC 2009)",
isbn = "9782953423310",
keywords = "HMM, arabic recognition, image recognition, duration model",
month = "nov",
title = "{M}od{\`e}les de {M}arkov {C}ach{\'e}s et {M}od{\`e}le de {L}ongueur pour la {R}econnaissance de l'{E}criture {A}rabe {\`a} {B}asse {R}{\'e}solution",
year = "2009",
}
• F. Slimane, R. Ingold, S. Kanoun, A. Alimi, and J. Hennebert, "Database and Evaluation Protocols for Arabic Printed Text Recognition," University of Fribourg, Department of Informatics, 296-09-01, 2009.
[Bibtex]
@techreport{slim09:tr296,
author = "Fouad Slimane and Rolf Ingold and Slim Kanoun and Adel Alimi and Jean Hennebert",
abstract = "We report on the creation of a database composed of images of Arabic Printed Text. The purpose of this database is the large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style text recognition systems in Arabic. Such systems take as input a text image and compute as output a character string corresponding to the text included in the image. The database is called APTI for Arabic Printed Text Image. The challenges that are addressed by the database are in the variability of the sizes, fonts and style used to generate the images. A focus is also given on low-resolution images where anti-aliasing is generating noise on the characters to recognize. The database is synthetically generated using a lexicon of 113’284 words, 10 Arabic fonts, 10 font sizes and 4 font styles. The database contains 45’313’600 single word images totaling to more than 250 million characters. Ground truth annotation is provided for each image in an XML file. The annotation includes the number of characters, the number of PAWs (Pieces of Arabic Word), the sequence of characters, the size, the style, the font used to generate each image, etc.",
institution = "University of Fribourg, Department of Informatics",
keywords = "database, arabic, image, text recognition",
month = " ",
number = "296-09-01",
title = "{D}atabase and {E}valuation {P}rotocols for {A}rabic {P}rinted {T}ext {R}ecognition",
type = " ",
year = "2009",
}
• F. Verdet, D. Matrouf, J. Bonastre, and J. Hennebert, "Factor Analysis and SVM for Language Recognition," in 10th Annual Conference of the International Speech Communication Association, InterSpeech, 2009, p. 164–167.
[Bibtex]
@conference{verd09:interspeech,
author = "Florian Verdet and Driss Matrouf and Jean-Fran{\c{c}}ois Bonastre and Jean Hennebert",
abstract = "Statistical classifiers operate on features that generally include both useful and useless information. These two types of information are difficult to separate in the feature domain. Recently, a new paradigm based on Factor Analysis (FA) proposed a model decomposition into useful and useless components. This method has successfully been applied to speaker recognition tasks. In this paper, we study the use of FA for language recognition. We propose a classification method based on SDC features and Gaussian Mixture Models (GMM). We present well performing systems using Factor Analysis and FA-based Support Vector Machine (SVM) classifiers. Experiments are conducted using NIST LRE 2005’s primary condition. The relative equal error rate reduction obtained by the best factor analysis configuration with respect to the baseline GMM-UBM system is over 60\%, corresponding to an EER of 6.59\%.",
booktitle = "10th Annual Conference of the International Speech Communication Association, InterSpeech",
crossref = " ",
editor = " ",
issn = "1990-9772",
keywords = "Language Identification, Speech Processing",
month = "sep",
number = " ",
organization = " ",
pages = "164--167",
publisher = " ",
series = " ",
title = "{F}actor {A}nalysis and {SVM} for {L}anguage {R}ecognition",
volume = " ",
year = "2009",
}
• F. Einsele, R. Ingold, and J. Hennebert, "A Language-Independent, Open-Vocabulary System Based on HMMs for Recognition of Ultra Low Resolution Words," Journal of Universal Computer Science, vol. 14, iss. 18, p. 2982–2997, 2008.
[Bibtex]
@article{einse08:jucs,
author = "Farshideh Einsele and Rolf Ingold and Jean Hennebert",
abstract = "In this paper, we introduce and evaluate a system capable of recognizing words extracted from ultra low resolution images such as those frequently embedded on web pages. The design of the system has been driven by the following constraints. First, the system has to recognize small font sizes between 6-12 points where anti-aliasing and resampling filters are applied. Such procedures add noise between adjacent characters in the words and complicate any a priori segmentation of the characters. Second, the system has to be able to recognize any words in an open vocabulary setting, potentially mixing different languages in Latin alphabet. Finally, the training procedure must be automatic, i.e. without requesting to extract, segment and label manually a large set of data. These constraints led us to an architecture based on ergodic HMMs where states are associated to the characters. We also introduce several improvements of the performance: increasing the order of the emission probability estimators, including minimum and maximum width constraints on the character models, and a training set consisting of all possible adjacency cases of Latin characters. The proposed system is evaluated on different font sizes and families, showing good robustness for sizes down to 6 points.",
crossref = " ",
doi = "10.3217/jucs-014-18-2982",
issn = "0948-6968",
journal = "Journal of Universal Computer Science",
keywords = "Text recognition, low-resolution images",
month = "October",
number = "18",
pages = "2982--2997",
title = "{A} {L}anguage-{I}ndependent, {O}pen-{V}ocabulary {S}ystem {B}ased on {HMM}s for {R}ecognition of {U}ltra {L}ow {R}esolution {W}ords",
volume = "14",
year = "2008",
}
• F. Einsele, R. Ingold, and J. Hennebert, "A Language-Independent, Open-Vocabulary System Based on HMMs for Recognition of Ultra Low Resolution Words," in 23rd Annual ACM Symposium on Applied Computing (ACM SAC 2008), Fortaleza, Ceara, Brasil, 2008, p. 429–433.
[Bibtex]
@conference{eins08:sac,
author = "Farshideh Einsele and Rolf Ingold and Jean Hennebert",
abstract = "In this paper, we introduce and evaluate a system capable of recognizing ultra low resolution words extracted from images such as those frequently embedded on web pages. The design of the system has been driven by the following constraints. First, the system has to recognize small font sizes where anti-aliasing and resampling procedures have been applied. Such procedures add noise on the patterns and complicate any a priori segmentation of the characters. Second, the system has to be able to recognize any words in an open vocabulary setting, potentially mixing different languages. Finally, the training procedure must be automatic, i.e. without requesting to extract, segment and label manually a large set of data. These constraints led us to an architecture based on ergodic HMMs where states are associated to the characters. We also introduce several improvements of the performance increasing the order of the emission probability estimators and including minimum and maximum duration constraints on the character models. The proposed system is evaluated on different font sizes and families, showing good robustness for sizes down to 6 points.",
booktitle = "23rd Annual ACM Symposium on Applied Computing (ACM SAC 2008), Fortaleza, Ceara, Brasil",
crossref = " ",
doi = "10.1145/1363686.1363791",
editor = " ",
isbn = "9781595937537",
keywords = "HMM, OCR",
month = " ",
number = " ",
organization = " ",
pages = "429--433",
publisher = " ",
series = " ",
title = "{A} {L}anguage-{I}ndependent, {O}pen-{V}ocabulary {S}ystem {B}ased on {HMM}s for {R}ecognition of {U}ltra {L}ow {R}esolution {W}ords",
volume = " ",
year = "2008",
}
• A. El Hannani and J. Hennebert, "A Review of the Benefits and Issues of Speaker Verification Evaluation Campaigns," in Proceedings of the ELRA Workshop on Evaluation at LREC 08, Marrakech, Morocco, 2008, p. 29–34.
[Bibtex]
@conference{elha08:elra,
author = "Asmaa El Hannani and Jean Hennebert",
abstract = "Evaluating speaker verification algorithms on relevant speech corpora is a key issue for measuring the progress and discovering the remaining difficulties of speaker verification systems. A common evaluation framework is also a key point when comparing systems produced by different labs. The speech group of the National Institute of Standards and Technology (NIST) has been organizing evaluations of text-independent telephony speaker verification technologies since 1997, with an increasing success and number of participants over the years. These NIST evaluations have been recognized by the speaker verification scientific community as a key factor for the improvement of the algorithms over the last decade. However, these evaluations measure exclusively the effectiveness in term of performance of the systems, assuming some conditions of use that are sometimes far away from any real-life commercial context for telephony applications. Other important aspects of speaker verification systems are also ignored by such evaluations, such as the efficiency, the usability and the robustness of the systems against impostor attacks. In this paper we present a review of the current NIST speaker verification evaluation methods, trying to put objectively into evidence their current benefits and limitations. We also propose some concrete solutions for going beyond these limitations.",
booktitle = "Proceedings of the ELRA Workshop on Evaluation at LREC 08, Marrakech, Morocco",
crossref = " ",
editor = " ",
keywords = "speaker verification, benchmarks",
month = " ",
note = "http://www.lrec-conf.org/proceedings/lrec2008/",
number = " ",
organization = " ",
pages = "29--34",
publisher = " ",
series = " ",
title = "{A} {R}eview of the {B}enefits and {I}ssues of {S}peaker {V}erification {E}valuation {C}ampaigns",
volume = " ",
year = "2008",
}
• B. Fauve, H. Bredin, W. Karam, F. Verdet, A. Mayoue, G. Chollet, J. Hennebert, R. Lewis, J. Mason, C. Mokbel, and D. Petrovska, "Some Results from the BioSecure Talking-Face Evaluation Campaign," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, Nevada, USA, 30/03/08-04/04/08, http://www.ieee.org/, 2008, p. 4137–4140.
[Bibtex]
@conference{fauv08:icassp,
author = "Benoit Fauve and Herv{\'e} Bredin and Walid Karam and Florian Verdet and Aur{\'e}lien Mayoue and G{\'e}rard Chollet and Jean Hennebert and Richard Lewis and John Mason and Chafik Mokbel and Dijana Petrovska",
abstract = "The BioSecure Network of Excellence has collected a large multi-biometric publicly available database and organized the BioSecure Multimodal Evaluation Campaigns (BMEC) in 2007. This paper reports on the Talking Faces campaign. Open source reference systems were made available to participants and four laboratories submitted executable code to the organizer who performed tests on sequestered data. Several deliberate impostures were tested. It is demonstrated that forgeries are a real threat for such systems. A technological race is ongoing between deliberate impostors and system developers.",
booktitle = "IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, Nevada, USA, 30/03/08-04/04/08",
crossref = " ",
doi = "10.1109/ICASSP.2008.4518565",
editor = " ",
isbn = "9781424414833",
keywords = "biometrics, talking face, evaluation, benchmarking",
month = " ",
number = " ",
organization = " ",
pages = "4137--4140",
publisher = "IEEE",
series = " ",
title = "{S}ome {R}esults from the {B}io{S}ecure {T}alking-{F}ace {E}valuation {C}ampaign",
volume = " ",
year = "2008",
}
• J. Hennebert, "Speaker Verification," in Biometrics And Human Identity, R. Rak, V. Matyás, and Z. Riha, Eds., Grada, 2008.
[Bibtex]
@incollection{henn08:speak,
author = "Jean Hennebert",
abstract = "Speaking is the most natural mean of communication between humans. Driven by a great deal of potential applications in human-machine interaction, systems have been developed to automatically extract the different pieces of information conveyed in the speech signal. There are three major tasks. In speech recognition tasks, the automatic system aims at discovering the sequence of words forming the spoken message. In language recognition tasks, the system attempts to identify the language used in a given piece of speech signal. Finally, speaker recognition systems aim to discover information about the identity of the speaker. Speaker recognition finds applications in many different areas such as access control, transaction authentication, law enforcement, speech data management and personalization. As for other biometric technologies the prime motivation of speaker recognition is to achieve a more usable and reliable personal identification than by using artifacts such as keys, badges, magnetic cards or memorized passwords. Interestingly, speaker recognition is one of the few biometric approach which is not based on image processing. Speaker recognition systems are often said to be performance-based since the user has to produce a sequence of sound. This is also a major difference with other passive biometrics for which the cooperation of the authenticated person is not requested, such as for fingerprints, iris or face recognition systems. Speaker recognition technologies are often ranked as less accurate than other biometric technologies such as finger print or iris scan. However, there are two main factors that make voice a compelling biometric. First, there is a proliferation of automated telephony services for which speaker recognition can be directly applied. Telephone handsets are indeed available basically everywhere and provide the required sensors for the speech signal. 
Second, talking is a very natural gesture, often considered as lowly intrusive by users as no physical contact is requested. These two factors, added to the recent scientific progresses, made voice biometric converge into a mature technology. Commercial products offering voice biometric are now available from different vendors. However, many technical and non-technical issues, discussed later in this chapter, still remain open and need to be tackled.",
booktitle = "Biometrics And Human Identity",
crossref = " ",
editor = "Roman Rak and V{\'a}clav Maty{\'a}s and Zdenek Riha",
isbn = "9788024723655",
key = " ",
keywords = "Biometrics, machine learning, Speaker Verification",
month = " ",
note = "Book title: Biometrics And Human Identity
ISBN-13: 978-80-247-2365-5",
organization = " ",
pages = " ",
title = "{S}peaker {V}erification",
url = "http://spolecenskeknihy.cz/?id=978-80-247-2365-5&p=4",
year = "2008",
}
• J. Hennebert, "Speaker Recognition Overview," in Encyclopedia of Biometrics, S. Li, Ed., Springer, 2009, vol. 2, pp. 1262-1270.
[Bibtex]
@inbook{henn09:enc,
author = "Jean Hennebert",
abstract = "Speaker recognition is the task of recognizing people from their voices. Speaker recognition is based on the extraction and modeling of acoustic features of speech that can differentiate individuals. These features conveys two kinds of biometric information: physiological properties (anatomical configuration of the vocal apparatus) and behavioral traits (speaking style). Automatic speaker recognition technology declines into four major tasks, speaker identification, speaker verification, speaker segmentation and speaker tracking. While these tasks are quite different by their potential applications, the underlying technologies are yet closely related.",
chapter = " ",
edition = " ",
editor = "Li, Stan",
isbn = "9780387730028",
keywords = "Biometrics, Speaker Verification",
month = " ",
note = "http://www.springer.com/computer/computer+imaging/book/978-0-387-73002-8",
number = " ",
pages = "1262-1270",
publisher = "Springer",
series = "Springer Reference",
title = "{E}ncyclopedia of {B}iometrics, {S}peaker {R}ecognition {O}verview",
type = " ",
volume = "2",
year = "2009",
}
• A. Humm, J. Hennebert, and R. Ingold, "Spoken Signature For User Authentication," SPIE Journal of Electronic Imaging, Special Section on Biometrics: ASUI January-March 2008, vol. 17, iss. 1, p. 011013-1–011013-11, 2008.
[Bibtex]
@article{humm08:spie,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "We are proposing a new user authentication system based on spoken signatures where online signature and speech signals are acquired simultaneously. The main benefit of this multimodal approach is a better accuracy at no extra costs for the user in terms of access time or inconvenience. Another benefit lies in a better robustness against intentional forgeries due to the extra difficulty for the forger to produce both signals. We have set up an experimental framework to measure these benefits on MyIDea, a realistic multimodal biometric database publicly available. More specifically, we evaluate the performance of state-of-the-art modelling systems based on GMM and HMM applied independently to the pen and voice signal where a simple rule-based score fusion procedure is used. We conclude that the best performance is achieved by the HMMs, provided that their topology is optimized on a per user basis. Furthermore, we show that more precise models can be obtained through the use of Maximum a posteriori probability (MAP) training instead of the classically used Expectation Maximization (EM). We also measure the impact of multi-session scenarios versus mono-session scenarios and the impact of skilled versus unskilled signature forgeries attacks.",
crossref = " ",
doi = "10.1117/1.2898526",
issn = "1017-9909",
journal = "SPIE Journal of Electronic Imaging, Special Section on Biometrics: ASUI January-March 2008",
keywords = "biometrics, speech, signature",
month = "April",
number = "1",
pages = "011013-1--011013-11",
title = "{S}poken {S}ignature {F}or {U}ser {A}uthentication",
volume = "17",
year = "2008",
}
• F. Slimane, R. Ingold, A. M. Alimi, and J. Hennebert, "Duration Models for Arabic Text Recognition using Hidden Markov Models," in International Conference on Computational Intelligence for Modelling, Control and Automation (CIMCA 08), Vienna, Austria, 2008, p. 838–843.
[Bibtex]
@conference{slim08:cimca,
author = "Fouad Slimane and Rolf Ingold and Adel Mohamed Alimi and Jean Hennebert",
abstract = "We present in this paper a system for recognition of printed Arabic text based on Hidden Markov Models (HMM). While HMMs have been successfully used in the past for such a task, we report here on significant improvements of the recognition performance with the introduction of minimum and maximum duration models. The improvements allow us to build a system working in open vocabulary mode, i.e., without any limitations on the size of the vocabulary. The evaluation of our system is performed using HTK (Hidden Markov Model Toolkit) on a database of word images that are synthetically generated.",
booktitle = "International Conference on Computational Intelligence for Modelling, Control and Automation (CIMCA 08), Vienna, Austria",
crossref = " ",
doi = "10.1109/CIMCA.2008.229",
editor = " ",
isbn = "9780769535142",
keywords = "hidden Markov models , image recognition , text analysis , visual databases",
month = "December",
number = " ",
organization = " ",
pages = "838--843",
publisher = " ",
series = " ",
title = "{D}uration {M}odels for {A}rabic {T}ext {R}ecognition using {H}idden {M}arkov {M}odels",
volume = " ",
year = "2008",
}
• F. Verdet and J. Hennebert, "Impostures of Talking Face Systems Using Automatic Face Animation," in IEEE Conference on Biometrics: Theory, Applications and Systems (BTAS 08), Arlington, Virginia, USA, 2008, pp. 1-4.
[Bibtex]
@conference{verd08:btas,
author = "Florian Verdet and Jean Hennebert",
abstract = "We present in this paper a new forgery scenario for the evaluation of face verification systems. The scenario is a replay-attack where we assume that the forger has got access to a still picture of the genuine user. The forger is then using a dedicated software to realistically animate the face image, reproducing head and lip movements according to a given speech waveform. The resulting forged video sequence is finally replayed to the sensor. Such attacks are nowadays quite easy to realize for potential forgers and can be opportunities to attempt to forge text-prompted challenge-response configurations of the verification system. We report the evaluation of such forgeries on the BioSecure BMEC talking face database where a set of 430 users are forged according to this face animation procedure. As expected, results show that these forgeries generate much more false acceptation in comparison to the classically used random forgeries. These results clearly show that such kind of forgery attack potentially represents a critical security breach for talking-face verification systems.",
booktitle = "IEEE Conference on Biometrics: Theory, Applications and Systems (BTAS 08), Arlington, Virginia, USA",
crossref = " ",
doi = "10.1109/BTAS.2008.4699367",
editor = " ",
isbn = "9781424427291",
keywords = "biometrics, talking face, benchmarking, forgeries",
month = "September",
number = " ",
organization = " ",
pages = "1-4",
publisher = " ",
series = " ",
title = "{I}mpostures of {T}alking {F}ace {S}ystems {U}sing {A}utomatic {F}ace {A}nimation",
volume = " ",
year = "2008",
}
• F. Einsele, J. Hennebert, and R. Ingold, "Towards Identification Of Very Low Resolution, Anti-Aliased Characters," in IEEE International Symposium on Signal Processing and its Applications (ISSPA'07), Sharjah, United Arab Emirates, 2007, pp. 1-4.
[Bibtex]
@conference{eins07:isspa,
author = "Farshideh Einsele and Jean Hennebert and Rolf Ingold",
abstract = "Current Web indexing technologies suffer from a severe drawback due to the fact that web documents often present textual information that is encapsulated in digital images and therefore not available as actual coded text. Moreover such images are not suited to be processed by existing OCR software, since they are generally designed for recognizing binary document images produced by scanners with resolutions between 200-600 dpi, whereas text embedded in web images is often anti-aliased and has generally a resolution between 72 and 90 dpi. The presented paper describes two preliminary studies about character identification at very low resolution (72 dpi) and small font sizes (3-12 pts). The proposed character identification system delivers identification rates up to 99.93 percents for 12'600 isolated character samples and up to 99.89 percents for 300'000 character samples in context.",
booktitle = "IEEE International Symposium on Signal Processing and its Applications (ISSPA'07), Sharjah, United Arab Emirates",
crossref = " ",
doi = "10.1109/ISSPA.2007.4555324",
editor = " ",
isbn = "9781424407781",
keywords = "OCR;Web indexing technology;antialaised character identification;binary document image recognition;low resolution character;textual information encapsulation;Internet;antialiasing;data encapsulation;document image processing;image resolution;indexing;",
month = "February",
number = " ",
organization = " ",
pages = "1-4",
publisher = " ",
series = " ",
title = "{T}owards {I}dentification {O}f {V}ery {L}ow {R}esolution, {A}nti-{A}liased {C}haracters",
volume = " ",
year = "2007",
}
• F. Einsele, R. Ingold, and J. Hennebert, "A HMM-Based Approach to Recognize Ultra Low Resolution Anti-Aliased Words," in Pattern Recognition and Machine Intelligence, A. Ghosh, R. De, and S. Pal, Eds., Springer Verlag, 2007, vol. 4815, pp. 511-518.
[Bibtex]
@inbook{eins07:premi,
author = "Farshideh Einsele and Rolf Ingold and Jean Hennebert",
abstract = "In this paper, we present a HMM based system that is used to recognize ultra low resolution text such as those frequently embedded in images available on the web. We propose a system that takes specifically the challenges of recognizing text in ultra low resolution images into account. In addition to this, we show in this paper that word models can be advantageously built connecting together sub-HMM-character models and inter-character state. Finally we report on the promising performance of the system using HMM topologies which have been improved to take into account the presupposed minimum length of each character.",
booktitle = "Pattern Recognition and Machine Intelligence",
doi = "10.1007/978-3-540-77046-6_63",
editor = "Ghosh, Ashish and De, Rajat and Pal, Sankar",
isbn = "9783540770459",
keywords = "HMM; OCR; Ultra-low resolution",
pages = "511-518",
publisher = "Springer Verlag",
series = "Lecture Notes in Computer Science, Pattern Recognition and Machine Intelligence",
title = "{A} {HMM}-{B}ased {A}pproach to {R}ecognize {U}ltra {L}ow {R}esolution {A}nti-{A}liased {W}ords",
volume = "4815",
year = "2007",
}
• J. Hennebert, "Please repeat: my voice is my password. From the basics to real-life implementations of speaker verification technologies," in Invited lecture at the Information Security Summit (IS2 2007), Prague, 2007.
[Bibtex]
@conference{henn07:iss,
author = "Jean Hennebert",
abstract = "Speaker verification finds applications in many different areas such as access control, transaction authentication, law enforcement, speech data management and personalization. As for other biometric technologies the prime motivation of speaker recognition is to achieve a more usable and reliable personal identification than by using artifacts such as keys, badges, magnetic cards or memorized passwords. Speaker verification technologies are often ranked as less accurate than other biometric technologies such as iris scan or fingerprints. However, there are two main factors that make voice a compelling biometric. First, there is a proliferation of automated telephony services for which speaker recognition can be directly applied. Second, talking is a very natural gesture, often considered as lowly intrusive by users as no physical contact is requested. These two factors, added to the recent scientific progresses, made voice biometric converge into a mature technology.",
booktitle = "Invited lecture at the Information Security Summit (IS2 2007), Prague",
crossref = " ",
editor = " ",
keywords = "Biometrics; Speaker Verification",
month = "May",
number = " ",
organization = " ",
pages = " ",
publisher = " ",
series = " ",
title = "{P}lease repeat: my voice is my password. {F}rom the basics to real-life implementations of speaker verification technologies",
volume = " ",
year = "2007",
}
• J. Hennebert, A. Humm, and R. Ingold, "Modelling Spoken Signatures With Gaussian Mixture Model Adaptation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 07), 2007, pp. 229-232.
[Bibtex]
@conference{henn07:icassp,
author = "Jean Hennebert and Andreas Humm and Rolf Ingold",
abstract = "We report on our developments towards building a novel user authentication system using combined acquisition of online handwritten signature and speech modalities. In our approach, signatures are recorded by asking the user to say what she/he is writing, leading to the so-called spoken signatures. We have built a verification system composed of two Gaussian Mixture Models (GMMs) sub-systems that model independently the pen and voice signal. We report on results obtained with two algorithms used for training the GMMs, respectively Expectation Maximization and Maximum A Posteriori Adaptation. Different algorithms are also compared for fusing the scores of each modality. The evaluations are conducted on spoken signatures taken from the MyIDea multimodal database, accordingly to the protocols provided with the database. Results are in favor of using MAP adaptation with a simple weighted sum fusion. Results show also clearly the impact of time variability and of skilled versus unskilled forgeries attacks.",
booktitle = "IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 07)",
crossref = " ",
doi = "10.1109/ICASSP.2007.366214",
editor = " ",
isbn = "1424407273",
issn = "1520-6149",
keywords = "Biometrics; Signature; Speech; Handwriting; Multimodal; GMM",
month = "April",
number = " ",
organization = " ",
pages = "229-232",
publisher = " ",
series = " ",
title = "{M}odelling {S}poken {S}ignatures {W}ith {G}aussian {M}ixture {M}odel {A}daptation",
volume = "2",
year = "2007",
}
• J. Hennebert, R. Loeffel, A. Humm, and R. Ingold, "A New Forgery Scenario Based On Regaining Dynamics Of Signature," in Advances in Biometrics, S.-W. Lee and S. Li, Eds., Springer Verlag, Lecture Notes in Computer Science, 2007, vol. 4642, pp. 366-375.
[Bibtex]
@inbook{henn07:icb,
author = "Jean Hennebert and Renato Loeffel and Andreas Humm and Rolf Ingold",
abstract = "We present in this paper a new forgery scenario for dynamic signature verification systems. In this scenario, we assume that the forger has got access to a static version of the genuine signature, is using a dedicated software to automatically recover dynamics of the signature and is using these regained signatures to break the verification system. We also show that automated procedures can be built to regain signature dynamics, making some simple assumptions on how signatures are performed. We finally report on the evaluation of these procedures on the MCYT-100 signature database on which regained versions of the signatures are generated. This set of regained signatures is used to evaluate the rejection performance of a baseline dynamic signature verification system. Results show that the regained forgeries generate much more false acceptation in comparison to the random and low-force forgeries available in the MCYT-100 database. These results clearly show that such kind of forgery attacks can potentially represent a critical security breach for signature verification systems.",
chapter = "ICB 2007",
doi = "10.1007/978-3-540-74549-5",
editor = "Seong-Whan Lee and Stan Li",
isbn = "9783540745488",
keywords = "Biometrics; Signature; Forgeries",
month = "August",
pages = "366-375",
publisher = "Springer Verlag",
series = "Lecture Notes in Computer Science, Advances in Biometrics",
title = "{A} {N}ew {F}orgery {S}cenario {B}ased {O}n {R}egaining {D}ynamics {O}f {S}ignature",
volume = "4642",
year = "2007",
}
• A. Humm, J. Hennebert, and R. Ingold, "Spoken Handwriting Verification using Statistical Models," in Proceedings of the Ninth International Conference on Document Analysis and Recognition - Volume 02, ICDAR'07, Washington, DC, USA, 2007, pp. 999-1003.
[Bibtex]
@conference{humm07:icdar,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "We are proposing a novel and efficient user authentication system using combined acquisition of online handwriting and speech signals. In our approach, signals are recorded by asking the user to say what she or he is simultaneously writing. This methodology has the clear advantage of acquiring two sources of biometric information at no extra cost in terms of time or inconvenience. We have built a straightforward verification system to model these signals using statistical models. It is composed of two Gaussian Mixture Models (GMMs) sub-systems that takes as input features extracted from the pen and voice signals. The system is evaluated on MyIdea, a realistic multimodal biometric database. Results show that the use of both speech and handwriting modalities outperforms significantly these modalities used alone. We also report on the evaluations of different training algorithms and fusion strategies.",
address = "Washington, DC, USA",
booktitle = "Proceedings of the Ninth International Conference on Document Analysis and Recognition - Volume 02, ICDAR'07",
crossref = " ",
doi = "10.1109/ICDAR.2007.4377065",
editor = " ",
isbn = "9780769528229",
issn = "1520-5363",
keywords = "Biometrics; Signature; Speech; Handwriting; Multimodal",
month = "September",
number = " ",
organization = " ",
pages = "999-1003",
publisher = "IEEE Computer Society",
title = "{S}poken {H}andwriting {V}erification using {S}tatistical {M}odels",
volume = "2",
year = "2007",
}
• A. Humm, J. Hennebert, and R. Ingold, "Hidden Markov Models for Spoken Signature Verification," in Biometrics: Theory, Applications, and Systems, 2007. BTAS 2007. First IEEE International Conference on, 2007, pp. 1-6.
[Bibtex]
@conference{humm07:btas,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "In this paper we report on the developments of an efficient user authentication system using combined acquisition of online signature and speech modalities. In our project, these two modalities are simultaneously recorded by asking the user to utter what she/he is writing. The main benefit of this multimodal approach is a better accuracy at no extra costs in terms of access time or inconvenience. More specifically, we report in this paper on significant improvements of our initial system that was based on Gaussian Mixture Models (GMMs) applied independently to the pen and voice signal. We show that the GMMs can be advantageously replaced by Hidden Markov Models (HMMs) provided that the number of state used for the topology is optimized and provided that the model parameters are trained with a Maximum a Posteriori (MAP) adaptation procedure instead of the classically used Expectation Maximization (EM). The evaluations are conducted on spoken signatures taken from the MyIDea multimodal database. Consistently with our previous evaluation of the GMM system, we observe for the HMM system that the use of both speech and handwriting modalities outperforms significantly these modalities used alone. We also report on the evaluations of different score fusion strategies.",
booktitle = "Biometrics: Theory, Applications, and Systems, 2007. BTAS 2007. First IEEE International Conference on",
doi = "10.1109/BTAS.2007.4401960",
isbn = "9781424415977",
keywords = "MAP adaptation procedure;hidden Markov models;maximum a posteriori adaptation procedure;multimodal approach;online signature;speech modalities;spoken signature verification;user authentication system;biometrics",
month = "September",
pages = "1-6",
title = "{H}idden {M}arkov {M}odels for {S}poken {S}ignature {V}erification",
year = "2007",
}
• A. Humm, J. Hennebert, and R. Ingold, "Modelling Combined Handwriting And Speech Modalities," in International Conference on Biometrics (ICB 2007), Seoul, Korea, Springer Verlag, Lecture Notes in Computer Science, Advances in Biometrics, 2007, vol. 4642, pp. 1025-1034.
[Bibtex]
@inbook{humm07:icb,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "We are reporting on consolidated results obtained with a new user authentication system based on combined acquisition of online handwriting and speech signals. In our approach, signals are recorded by asking the user to say what she or he is simultaneously writing. This methodology has the clear advantage of acquiring two sources of biometric information at no extra cost in terms of time or inconvenience. We are proposing here two scenarios of use: spoken signature where the user signs and speaks at the same time and spoken handwriting where the user writes and says what is written. These two scenarios are implemented and fully evaluated using a verification system based on Gaussian Mixture Models (GMMs). The evaluation is performed on MyIdea, a realistic multimodal biometric database. Results show that the use of both speech and handwriting modalities outperforms significantly these modalities used alone, for both scenarios. Comparisons between the spoken signature and spoken handwriting scenarios are also drawn.",
booktitle = "International Conference on Biometrics (ICB 2007), Seoul Korea",
chapter = "ICB 2007",
crossref = " ",
doi = "10.1007/978-3-540-74549-5",
editor = " ",
isbn = "9783540745488",
keywords = "Biometrics; Signature; Speech; Handwriting; Multimodal",
month = "August",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "1025-1034",
publisher = "Springer Verlag",
series = "Lecture Notes in Computer Science, Advances in Biometrics",
title = "{M}odelling {C}ombined {H}andwriting {A}nd {S}peech {M}odalities",
volume = "4642",
year = "2007",
}
• A. E. Hannani, D. Toledano, D. Petrovska, A. Montero-Asenjo, and J. Hennebert, "Using Data-driven and Phonetic Units for Speaker Verification," in IEEE Speaker and Language Recognition Workshop (Odyssey 2006), Puerto Rico, 2006, pp. 1-6.
[Bibtex]
@conference{elha06:odis,
author = "Asmaa El Hannani and Doroteo Toledano and Dijana Petrovska and Alberto Montero-Asenjo and Jean Hennebert",
abstract = "Recognition of speaker identity based on modeling the streams produced by phonetic decoders (phonetic speaker recognition) has gained popularity during the past few years. Two of the major problems that arise when phone based systems are being developed are the possible mismatches between the development and evaluation data and the lack of transcribed databases. Data-driven segmentation techniques provide a potential solution to these problems because they do not use transcribed data and can easily be applied on development data minimizing the mismatches. In this paper we compare speaker recognition results using phonetic and data-driven decoders. To this end, we have compared the results obtained with a speaker recognition system based on data-driven acoustic units and phonetic speaker recognition systems trained on Spanish and English data. Results obtained on the NIST 2005 Speaker Recognition Evaluation data show that the data-driven approach outperforms the phonetic one and that further improvements can be achieved by combining both approaches.",
booktitle = "IEEE Speaker and Language Recognition Workshop (Odyssey 2006), Puerto Rico",
crossref = " ",
doi = "10.1109/ODYSSEY.2006.248134",
editor = " ",
isbn = "142440472X",
keywords = "Biometrics; Speaker Verification",
month = " June",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = " 1-6",
publisher = " ",
series = " ",
title = "{U}sing {D}ata-driven and {P}honetic {U}nits for {S}peaker {V}erification",
volume = " ",
year = "2006",
}
• J. Hennebert, A. Humm, and R. Ingold, "Vérification d'Identité par Ecriture et Parole Combinées," in Colloque International Francophone sur l'Ecrit et le Document, Fribourg, Suisse (CIFED 2006), 2006.
[Bibtex]
@conference{henn06:cifed,
author = "Jean Hennebert and Andreas Humm and Rolf Ingold",
abstract = "We report the first developments of an identity verification system based on the combined use of handwriting and speech. The novelty of our approach lies in the simultaneous recording of these two modalities by asking the user to say aloud what he is writing. We present and analyse two scenarios: the spoken signature, where the user utters the content of his signature, and spoken handwriting. We describe the acquisition system, the recording of an evaluation database, the results of an acceptability survey, the verification system based on Gaussian mixtures, and the results obtained with this system for the signature scenario.",
booktitle = "Colloque International Francophone sur l'Ecrit et le Document, Fribourg, Suisse (CIFED 2006)",
crossref = " ",
editor = " ",
keywords = "Biometrics; Signature; Speech; Handwriting",
month = "September",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = " ",
publisher = " ",
series = " ",
title = "{V}{\'e}rification d'{I}dentit{\'e} par {E}criture et {P}arole {C}ombin{\'e}es",
volume = " ",
year = "2006",
}
• J. Hennebert, A. Wahl, and A. Humm, Video of Sign4J, a Novel Tool to Generate Brute-Force Signature Forgeries, 2006.
[Bibtex]
@misc{henn06:sign,
author = "Jean Hennebert and Alain Wahl and Andreas Humm",
abstract = "In this video, we present a procedure to create brute-force signature forgeries using Sign4J, a dynamic signature imitation training software that was specifically built to help people learn to imitate the dynamics of signatures. The main novelty of the procedure lies in a feedback mechanism that lets the user know how good the imitation is and which part of the signature still needs improvement. The procedure implemented in the Sign4J software is described in: A. Wahl, J. Hennebert, A. Humm and R. Ingold. ``Generation and Evaluation of Brute-Force Signature Forgeries''. International Workshop on Multimedia Content Representation, Classification and Security (MRCS'06), Istanbul, Turkey. 2006. pp. 2-9. In this publication, we report on a large-scale test on the MCYT-100 database. The procedure and the software are used to generate a set of brute-force signatures on the MCYT-100 database. This set of forged signatures is used to evaluate the rejection performance of a baseline dynamic signature verification system. As expected, the brute-force forgeries generate more false acceptation in comparison to the random and low-force forgeries available in the MCYT-100 database.",
keywords = "biometrics; Signature; Forgeries",
month = "September",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
title = "{V}ideo of {S}ign4{J}, a {N}ovel {T}ool to {G}enerate {B}rute-{F}orce {S}ignature {F}orgeries",
year = "2006",
}
• A. Humm, J. Hennebert, and R. Ingold, "Scenario and Survey of Combined Handwriting and Speech Modalities for User Authentication," in 6th International Conference on Recent Advances in Soft Computing (RASC 2006), Canterbury, Kent, UK, 2006, pp. 496-501.
[Bibtex]
@conference{humm06:rasc,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "We report on our developments towards building a novel user authentication system using combined handwriting and speech modalities. In our project, these modalities are simultaneously recorded by asking the user to utter what he is writing. We introduce two potential scenarios that we have identified as candidates for applications and we describe the database recorded according to these scenarios. We then report on a usability survey that we have conducted while recording the database. Finally, we present preliminary performance results obtained on the database using one of the scenarios.",
booktitle = "6th International Conference on Recent Advances in Soft Computing (RASC 2006), Canterbury, Kent, UK",
crossref = " ",
editor = " ",
keywords = "Biometrics; Signature; Speech; Handwriting",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "496-501",
publisher = " ",
series = " ",
title = "{S}cenario and {S}urvey of {C}ombined {H}andwriting and {S}peech {M}odalities for {U}ser {A}uthentication",
volume = " ",
year = "2006",
}
• A. Humm, J. Hennebert, and R. Ingold, "Combined Handwriting and Speech Modalities for User Authentication," University of Fribourg, Department of Informatics, 270-06-05, 2006.
[Bibtex]
@techreport{humm06:tr270,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "We report on our first developments towards building a novel user authentication system using combined handwriting and speech modalities. In our project, these modalities are simultaneously recorded by asking the user to utter what he is writing. We first report on a database that we have recorded according to this scenario. Then, we report on the results of a usability survey that we have conducted while recording the database. Finally, we present the assessment protocols for authentication systems defined on the database.",
institution = "University of Fribourg, Department of Informatics",
keywords = "Biometrics; Signature; Speech; Handwriting; Multimodal",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "270-06-05",
title = "{C}ombined {H}andwriting and {S}peech {M}odalities for {U}ser {A}uthentication",
type = " ",
year = "2006",
}
• A. Humm, J. Hennebert, and R. Ingold, "Gaussian Mixture Models for CHASM Signature Verification," in 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 06), Washington, USA, 2006, pp. 102-113.
[Bibtex]
@conference{humm06:mlmi,
author = "Andreas Humm and Jean Hennebert and Rolf Ingold",
abstract = "In this paper we report on first experimental results of a novel multimodal user authentication system based on a combined acquisition of online handwritten signature and speech modalities. In our project, the so-called CHASM signatures are recorded by asking the user to utter what he is writing. CHASM actually stands for Combined Handwriting and Speech Modalities where the pen and voice signals are simultaneously recorded. We have built a baseline CHASM signature verification system for which we have conducted a complete experimental evaluation. This baseline system is composed of two Gaussian Mixture Models sub-systems that model independently the pen and voice signal. A simple fusion of both sub-systems is performed at the score level. The evaluation of the verification system is conducted on CHASM signatures taken from the MyIDea multimodal database, accordingly to the protocols provided with the database. This allows us to draw our first conclusions in regards to time variability impact, to skilled versus unskilled forgeries attacks and to some training parameters. Results are also reported for the two sub-systems evaluated separately and for the global system.",
booktitle = "3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 06), Washington, USA",
crossref = " ",
editor = "Steve Renals and Samy Bengio and Jonathan Fiscus",
isbn = "9783540692676",
keywords = "Biometrics; Signature; Speech; Handwriting",
month = " May",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = " 102-113",
publisher = "Springer Verlag",
series = " Lecture Notes in Computer Science",
title = "{G}aussian {M}ixture {M}odels for {CHASM} {S}ignature {V}erification",
volume = " 4299",
year = "2006",
}
• A. Wahl, J. Hennebert, A. Humm, and R. Ingold, "Generation and Evaluation of Brute-Force Signature Forgeries," in International Workshop on Multimedia Content Representation, Classification and Security (MRCS'06), Istanbul, Turkey, 2006, pp. 2-9.
[Bibtex]
@conference{wahl06:mrcs,
author = "Alain Wahl and Jean Hennebert and Andreas Humm and Rolf Ingold",
abstract = "We present a procedure to create brute-force signature forgeries. The procedure is supported by Sign4J, a dynamic signature imitation training software that was specifically built to help people learn to imitate the dynamics of signatures. The main novelty of the procedure lies in a feedback mechanism that is provided to let the user know how good the imitation is and on what part of the signature the user has still to improve. The procedure and the software are used to generate a set of brute-force signatures on the MCYT-100 database. This set of forged signatures is used to evaluate the rejection performance of a baseline dynamic signature verification system. As expected, the brute-force forgeries generate more false acceptation in comparison to the random and low-force forgeries available in the MCYT-100 database.",
booktitle = "International Workshop on Multimedia Content Representation, Classification and Security (MRCS'06), Istanbul, Turkey",
crossref = " ",
editor = " ",
isbn = "9783540393924",
keywords = "Biometrics; Signature; Forgeries",
month = " September",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "2-9",
publisher = " ",
series = " Lecture Notes in Computer Science",
title = "{G}eneration and {E}valuation of {B}rute-{F}orce {S}ignature {F}orgeries",
volume = "4105",
year = "2006",
}
• A. Wahl, J. Hennebert, A. Humm, and R. Ingold, "A novel method to generate Brute-Force Signature Forgeries," University of Fribourg, Department of Informatics, 274-06-09, 2006.
[Bibtex]
@techreport{wahl06:tr274,
author = "Alain Wahl and Jean Hennebert and Andreas Humm and Rolf Ingold",
abstract = "We present a procedure to create brute-force signature forgeries. The procedure is supported by Sign4J, a dynamic signature imitation training software that was specifically built to help people learn to imitate the dynamics of signatures. The main novelty of the procedure lies in a feedback mechanism that is provided to let the user know how good the imitation is and on what part of the signature the user has still to improve. The procedure and the software are used to generate a set of brute-force signatures on the MCYT-100 database. This set of forged signatures is used to evaluate the rejection performance of a baseline dynamic signature verification system. As expected, the brute-force forgeries generate more false acceptation in comparison to the random and low-force forgeries available in the MCYT-100 database.",
institution = "University of Fribourg, Department of Informatics",
keywords = "Biometrics; Signature; Forgeries",
month = " September",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "274-06-09",
title = "{A} novel method to generate {B}rute-{F}orce {S}ignature {F}orgeries",
type = " ",
year = "2006",
}
• B. Dumas, C. Pugin, J. Hennebert, D. Petrovska, A. Humm, F. Evequoz, R. Ingold, and D. von Rotz, "MyIDea - Multimodal Biometrics Database, Description of Acquisition Protocols," in Biometrics on the Internet, 3rd COST 275 Workshop, Hatfield, UK, 2005, pp. 59-62.
[Bibtex]
@conference{duma05:cost,
author = "Bruno Dumas and Catherine Pugin and Jean Hennebert and Dijana Petrovska and Andreas Humm and Florian Evequoz and Rolf Ingold and von Rotz, Didier",
abstract = "This document describes the acquisition protocols of MyIDea, a new large and realistic multimodal biometric database designed to conduct research experiments in Identity Verification (IV). The key points of MyIDea are threefold: (1) it is strongly multimodal; (2) it implements realistic scenarios in an open-set framework; (3) it uses sensors of different quality to record most of the modalities. The combination of these three points makes MyIDea novel and pretty unique in comparison to existing databases. Furthermore, special care is put in the design of the acquisition procedures to allow MyIDea to complement existing databases such as BANCA, MCYT or BIOMET. MyIDea includes talking face, audio, fingerprints, signature, handwriting and hand geometry. MyIDea will be available early 2006 with an initial set of 104 subjects recorded over three sessions. Other recording sets will be potentially planned in 2006.",
booktitle = "Biometrics on the Internet, 3rd COST 275 Workshop, Hatfield, UK",
crossref = " ",
editor = " ",
keywords = "Biometrics; Database; Speech; Image; Fingerprint; Hand; Handwriting; Signature",
month = " ",
number = " ",
organization = " ",
pages = "59-62",
publisher = " ",
series = " ",
title = "{M}y{ID}ea - {M}ultimodal {B}iometrics {D}atabase, {D}escription of {A}cquisition {P}rotocols",
volume = " ",
year = "2005",
}
• J. Hennebert, A. Humm, B. Dumas, C. Pugin, and F. Evequoz, Web Site of MyIDea Multimodal Database, 2005.
[Bibtex]
@misc{henn05:myid,
author = "Jean Hennebert and Andreas Humm and Bruno Dumas and Catherine Pugin and Florian Evequoz",
abstract = "In the framework of the Swiss National Center of Competence in Research (NCCR) on Interactive Multimodal Information Management IM2 and of the European IST BioSecure project, the DIVA group of the Department of Informatics of the University of Fribourg (DIUF) has recorded a multimodal biometric database called MyIDea. The recorded data will be made available to institutes for research purposes.
The acquisition campaign started at the end of 2004 and finished in December 2005. The database is now in its validation phase. Some data sets are already available for distribution (please contact us to check the planned availability dates, or fill in the online form to express your interest in the data).",
howpublished = "http://diuf.unifr.ch/go/myidea",
keywords = "Biometrics; Database; Speech; Image; Fingerprint; Hand; Handwriting",
month = " ",
title = "{W}eb {S}ite of {M}y{ID}ea {M}ultimodal {D}atabase",
url = "http://diuf.unifr.ch/go/myidea",
year = "2005",
}
• B. Dumas, J. Hennebert, A. Humm, R. Ingold, D. Petrovska, C. Pugin, and D. von Rotz, "MyIdea - Sensors Specifications and Acquisition Protocol," University of Fribourg, Department of Informatics, 256-05-12, 2005.
[Bibtex]
@techreport{henn05:tr256,
author = "Bruno Dumas and Jean Hennebert and Andreas Humm and Rolf Ingold and Dijana Petrovska and Catherine Pugin and von Rotz, Didier",
abstract = "In this document we describe the sensor specifications and acquisition protocol of MyIdea, a new large and realistic multi-modal biometric database designed to conduct research experiments in Identity Verification.",
institution = "University of Fribourg, Department of Informatics",
keywords = "Biometrics, Database, Speech, Image, Fingerprint, Hand, Handwriting",
month = " June",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "256-05-12",
title = "{M}y{I}dea - {S}ensors {S}pecifications and {A}cquisition {P}rotocol",
type = " ",
year = "2005",
}
• F. Simillion, J. Hennebert, and M. Wentland, "From Prediction to Classification : The Application of Pattern Recognition Theory to Stock Price Movements Analysis," in Second Congrès International de Gestion et d'Economie Floue (2nd CIGEF), 1995, pp. 1-15.
[Bibtex]
@conference{simi95:sigef,
author = "Fabian Simillion and Jean Hennebert and Maria Wentland",
abstract = "The limited success of most prediction systems has proved that future stock prices are very difficult to predict. The purpose of this paper is to show that future prices do not have to be known to make successful investments and that anticipating movements (increases or decreases) of the price can be sufficient. Probabilistic classification systems based on pattern recognition theory appear to be a good way to reach this objective. Moreover, they include some other advantages, principally in terms of risk management. Results show satisfactory classification hit rates but a rather poor translation into financial gains. This paper tries to identify causes of this problem and proposes some ideas of solution. ",
booktitle = "Second Congr{\`e}s International de Gestion et d'Economie Floue (2nd CIGEF)",
crossref = " ",
editor = " ",
keywords = "MLP, Parzen, Financial Prediction, Pattern Matching, Machine Learning",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "1-15",
publisher = " ",
series = " ",
title = "{F}rom {P}rediction to {C}lassification : {T}he {A}pplication of {P}attern {R}ecognition {T}heory to {S}tock {P}rice {M}ovements {A}nalysis",
volume = " ",
year = "1995",
}
• J. Hennebert, E. Mosanya, G. Zanellato, F. Hambye, and U. Mosanya, EPO Patent pending: Speech Recognition Device, 2003.
[Bibtex]
@misc{henn03:epo,
author = "Jean Hennebert and Emeka Mosanya and Georges Zanellato and Fr{\'e}d{\'e}ric Hambye and Ugo Mosanya",
abstract = "A speech recognition device having a hidden operator communication unit and being connectable to a voice communication system having a user communication unit, said speech recognition device comprising a processing unit and a memory provided for storing speech recognition data comprising command models and at least one threshold value (T) said processing unit being provided for processing speech data, received from said voice communication system, by scoring said command models against said speech data in order to determine at least one recognition hypothesis (O), said processing unit being further provided for determining a confidence score (S) on the basis of said recognition hypothesis and for weighing said confidence score against said threshold values in order to accept or reject said received speech data, said device further comprises forwarding means provided for forwarding said speech data to said hidden operator communication unit in response to said rejection of received speech data, said hidden operator communication unit being provided for generating upon receipt of said rejection a recognition string based on said received speech data, said hidden operator communication unit being further provided for generating a target hypothesis (Ot) on the basis of said recognition string generated by said hidden operator communication unit, said device further comprising evaluation means provided for evaluating said target hypothesis with respect to said determined recognition hypothesis and for adapting said stored command models and/or threshold values on the basis of results obtained by said evaluation.",
howpublished = " EPO EP1378886 (A1) ― 2004-01-07",
keywords = "Speech Processing, Speech Recognition",
month = " ",
title = "{EPO} {P}atent pending: {S}peech {R}ecognition {D}evice",
year = "2003",
}
• C. Fredouille, J. Mariethoz, C. Jaboulet, J. Hennebert, C. Mokbel, and F. Bimbot, "Behavior of a Bayesian adaptation method for incremental enrollment in speaker verification," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), Istanbul, Turkey, 2000, pp. 1197-1200.
[Bibtex]
@conference{fred00:icassp,
author = "Corinne Fredouille and Johnny Mariethoz and C{\'e}dric Jaboulet and Jean Hennebert and Chafik Mokbel and Fr{\'e}d{\'e}ric Bimbot",
abstract = "Classical adaptation approaches are generally used for speaker or environment adaptation of speech recognition systems. In this paper, we use such techniques for the incremental training of client models in a speaker verification system. The initial model is trained on a very limited amount of data and then progressively updated with access data, using a segmental-EM procedure. In supervised mode (i.e. when access utterances are certified), the incremental approach yields equivalent performance to the batch one. We also investigate on the impact of various scenarios of impostor attacks during the incremental enrollment phase. All results are obtained with the Picassoft platform - the state-of-the-art speaker verification system developed in the PICASSO project.",
booktitle = "IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), Istanbul, Turkey",
crossref = " ",
doi = "10.1109/ICASSP.2000.859180",
editor = " ",
isbn = "0780362934",
keywords = "Speaker Verification; Speech Processing; Bayesian Adaptation",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "1197-1200",
publisher = " ",
series = " ",
title = "{B}ehavior of a {B}ayesian adaptation method for incremental enrollment in speaker verification",
volume = "2",
year = "2000",
}
• C. Fredouille, J. Mariethoz, C. Jaboulet, J. Hennebert, C. Mokbel, and F. Bimbot, "Behavior of a Bayesian adaptation method for incremental enrollment in speaker verification - Technical Report," IDIAP, 02, 2000.
[Bibtex]
@techreport{fred00:idiap,
author = "Corinne Fredouille and Johnny Mariethoz and C{\'e}dric Jaboulet and Jean Hennebert and Chafik Mokbel and Fr{\'e}d{\'e}ric Bimbot",
abstract = "Classical adaptation approaches are generally used for speaker or environment adaptation of speech recognition systems. In this paper, we use such techniques for the incremental training of client models in a speaker verification system. The initial model is trained on a very limited amount of data and then progressively updated with access data, using a segmental-EM procedure. In supervised mode (i.e. when access utterances are certified), the incremental approach yields equivalent performance to the batch one. We also investigate on the impact of various scenarios of impostor attacks during the incremental enrollment phase. All results are obtained with the Picassoft platform - the state-of-the-art speaker verification system developed in the PICASSO project.",
institution = "IDIAP",
keywords = "Speaker Verification; Speech Processing; Bayesian Adaptation",
month = " January",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "02",
title = "{B}ehavior of a {B}ayesian adaptation method for incremental enrollment in speaker verification - {T}echnical {R}eport",
type = " ",
year = "2000",
}
• D. Petrovska, J. Cernocky, J. Hennebert, and G. Chollet, "Segmental Approaches for Automatic Speaker Verification," Digital Signal Processing Journal, vol. 10, iss. 1-3, pp. 198-212, 2000.
[Bibtex]
@article{petr00:dsp,
author = "Dijana Petrovska and Jan Cernocky and Jean Hennebert and G{\'e}rard Chollet",
abstract = "Speech is composed of different sounds (acoustic segments). Speakers differ in their pronunciation of these sounds. The segmental approaches described in this paper are meant to exploit these differences for speaker verification purposes. For such approaches, the speech is divided into different classes, and the speaker modeling is done for each class. The speech segmentation applied is based on automatic language independent speech processing tools that provide a segmentation of the speech requiring neither phonetic nor orthographic transcriptions of the speech data. Two different speaker modeling approaches, based on multilayer perceptrons (MLPs) and on Gaussian mixture models (GMMs), are studied. The MLP-based segmental systems have performance comparable to that of the global MLP-based systems, and in the mismatched train-test conditions slightly better results are obtained with the segmental MLP system. The segmental GMM systems gave poorer results than the equivalent global GMM systems.",
crossref = " ",
doi = "10.1006/dspr.2000.0370",
issn = "1051-2004",
journal = "Digital Signal Processing Journal",
keywords = "Speaker Verification; Speech Processing",
month = "January",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "1-3",
pages = "198-212",
title = "{S}egmental {A}pproaches for {A}utomatic {S}peaker {V}erification",
volume = " 10",
year = "2000",
}
• J. Hennebert, H. Melin, D. Petrovska, and D. Genoud, "POLYCOST: A telephone-speech database for speaker recognition," Speech Communication, vol. 31, iss. 2-3, pp. 265-270, 2000.
[Bibtex]
@article{henn00:spec,
author = "Jean Hennebert and Hakan Melin and Dijana Petrovska and Dominique Genoud",
abstract = "This article presents an overview of the POLYCOST database dedicated to speaker recognition applications over the telephone network. The main characteristics of this database are: medium mixed speech corpus size (>100 speakers), English spoken by foreigners, mainly digits with some free speech, collected through international telephone lines, and minimum of nine sessions for 85\% of the speakers.",
crossref = " ",
doi = "10.1016/S0167-6393(99)00082-5",
issn = "0167-6393",
journal = "Speech Communication",
keywords = "Speaker Verification; Database; Speech Processing",
month = "June",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "2-3",
pages = "265-270",
title = "{POLYCOST}: {A} telephone-speech database for speaker recognition",
volume = "31",
year = "2000",
}
• B. Nedic, F. Bimbot, R. Blouet, J. Bonastre, G. Caloz, J. Cernocky, G. Chollet, G. Durou, C. Fredouille, D. Genoud, G. Gravier, J. Hennebert, J. Kharroubi, I. Magrin-Chagnolleau, T. Merlin, C. Mokbel, D. Petrovska, S. Pigeon, M. Seck, P. Verlinde, and M. Zouhal, "The ELISA Systems for the NIST'99 Evaluation in Speaker Detection and Tracking," Digital Signal Processing Journal, vol. 10, iss. 1-3, pp. 143-153, 2000.
[Bibtex]
@article{nedic00:dsp,
author = "Bojan Nedic and Fr{\'e}d{\'e}ric Bimbot and Rapha{\"e}l Blouet and Jean-Fran{\c{c}}ois Bonastre and Gilles Caloz and Jan Cernocky and G{\'e}rard Chollet and Geoffrey Durou and Corinne Fredouille and Dominique Genoud and Guillaume Gravier and Jean Hennebert and Jamal Kharroubi and Ivan Magrin-Chagnolleau and Teva Merlin and Chafik Mokbel and Dijana Petrovska and St{\'e}phane Pigeon and Mouhamadou Seck and Patrick Verlinde and Meriem Zouhal",
abstract = "This article presents the text-independent speaker detection and tracking systems developed by the members of the ELISA Consortium for the NIST'99 speaker recognition evaluation campaign. ELISA is a consortium grouping researchers of several laboratories sharing software modules, resources and experimental protocols. Each system is briefly described, and comparative results on the NIST'99 evaluation tasks are discussed.",
crossref = " ",
doi = "10.1006/dspr.1999.0365",
issn = "1051-2004",
journal = "Digital Signal Processing Journal",
keywords = "text-independent; speaker verification; speaker detection; speaker tracking; NIST evaluation campaign",
month = "January",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = "1-3",
pages = "143-153",
title = "{T}he {ELISA} {S}ystems for the {NIST}'99 {E}valuation in {S}peaker {D}etection and {T}racking",
volume = "10",
year = "2000",
}
• J. Cernocky, G. Baudoin, D. Petrovska, J. Hennebert, and G. Chollet, "Automatically derived speech units: applications to very low rate coding and speaker verification," in First Workshop on Text Speech and Dialog (TSD'98), Brno, Czech Republic, 1998, pp. 183-188.
[Bibtex]
@conference{cern98:tsd,
author = "Jan Cernocky and Genevi{\`e}ve Baudoin and Dijana Petrovska and Jean Hennebert and G{\'e}rard Chollet",
abstract = "Current systems for recognition, synthesis, very low bit-rate (VLBR) coding and text-independent speaker verification rely on sub-word units determined using phonetic knowledge. This paper presents an alternative to this approach: determination of speech units using ALISP (Automatic Language Independent Speech Processing) tools. Experimental results for speaker-dependent VLBR coding are reported on two databases: an average rate of 120 bps for unit encoding was achieved. In verification, this approach was tested during 1998's NIST-NSA evaluation campaign with an MLP-based scoring system.",
booktitle = "First Workshop on Text Speech and Dialog (TSD'98), Brno, Czech Republic",
crossref = " ",
editor = " ",
isbn = "8021018992",
keywords = "Speaker Verification; Speech Processing",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "183-188",
publisher = " ",
series = " ",
title = "{A}utomatically derived speech units: applications to very low rate coding and speaker verification",
volume = " ",
year = "1998",
}
• G. Chollet, J. Cernocky, G. Gravier, J. Hennebert, D. Petrovska, and F. Yvon, "Towards fully automatic speech processing techniques for interactive voice servers," in Speech Processing, Recognition and Artificial Neural Networks: Proceedings of the 3rd International School on Neural Nets, Eduardo Caianiello, G. Chollet, G. M. Di Benedetto, A. Esposito, and M. Marinaro, Eds., Springer Verlag, 1999, p. 346.
[Bibtex]
@incollection{chol98:towards,
author = "G{\'e}rard Chollet and Jan Cernocky and Guillaume Gravier and Jean Hennebert and Dijana Petrovska and Fran{\c{c}}ois Yvon",
abstract = "Speech Processing, Recognition and Artificial Neural Networks contains papers from leading researchers and selected students, discussing the experiments, theories and perspectives of acoustic phonetics as well as the latest techniques in the field of speech science and technology. Topics covered in this book include: Fundamentals of Speech Analysis and Perceptron; Speech Processing; Stochastic Models for Speech; Auditory and Neural Network Models for Speech; Task-Oriented Applications of Automatic Speech Recognition and Synthesis.",
booktitle = "Speech Processing, Recognition and Artificial Neural Networks: Proceedings of the 3rd International School on Neural Nets, Eduardo Caianiello",
chapter = " ",
edition = " ",
editor = "Gerard Chollet and Gabriella M. Di Benedetto and Anna Esposito and Maria Marinaro",
isbn = "1852330945",
keywords = "Speech Processing, Speech Recognition",
month = "April",
note = "PDF may not be the final published version. Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
pages = "346",
publisher = "Springer Verlag",
title = "{T}owards fully automatic speech processing techniques for interactive voice servers",
type = " ",
volume = " ",
year = "1999",
}
• J. Hennebert, "Hidden Markov models and artificial neural networks for speech and speaker recognition," PhD Thesis, Lausanne, 1998.
[Bibtex]
@phdthesis{henn98:phd,
author = "Jean Hennebert",
abstract = "In this thesis, we are concerned with the two fields of automatic speech recognition (ASR) and automatic speaker recognition (ASkR) in telephony. More precisely, we are interested in systems based on hidden Markov models (HMMs) in which artificial neural networks (ANNs) are used in place of more classical tools. This work is dedicated to the analysis of three approaches. The first one, mainly original, concerns the use of Self-Organizing Maps in discrete HMMs for isolated word speech recognition. The second approach concerns continuous hybrid HMM/ANN systems, extensively studied in previous research work. The system is not original in its form, but its analysis made it possible to establish a new theoretical framework and to introduce some extensions regarding the way the system is trained. The last part concerns the implementation of a new ANN segmental approach for text-independent speaker verification.",
doi = "10.5075/epfl-thesis-1860",
keywords = "ANN, HMM, Artificial Neural Networks, Hidden Markov Models, Speech Recognition",
month = "October",
note = "http://library.epfl.ch/theses/?nr=1860
Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
publisher = "EPFL",
school = "EPFL",
title = "{H}idden {M}arkov models and artificial neural networks for speech and speaker recognition",
type = "PhD Thesis",
year = "1998",
}
• J. Hennebert and D. Petrovska, "Phoneme Based Text-Prompted Speaker Verification with Multi-Layer Perceptrons," in Speaker Recognition and its Commercial and Forensic Applications (RLA2C), Avignon, France, 1998, pp. 55-58.
[Bibtex]
@conference{henn98:rla2c,
author = "Jean Hennebert and Dijana Petrovska",
abstract = "Results presented in this paper are obtained in the framework of a text-prompted speaker verification system using Hidden Markov Models (HMMs) and Multi Layer Perceptrons (MLPs). The aims of the study described here are (1) to assess the relative speaker discriminant properties of phonemes with different temporal frame-to-frame context at the input of the MLP's and (2) to study the influence of two sampling techniques of the acoustic vectors while training the MLP's.",
booktitle = "Speaker Recognition and its Commercial and Forensic Applications (RLA2C), Avignon, France",
crossref = " ",
editor = " ",
keywords = "Speaker Verification; Speech Processing; MLP",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "55-58",
publisher = " ",
series = " ",
title = "{P}honeme {B}ased {T}ext-{P}rompted {S}peaker {V}erification with {M}ulti-{L}ayer {P}erceptrons",
volume = " ",
year = "1998",
}
• D. Petrovska, J. Hennebert, H. Melin, and D. Genoud, "POLYCOST : A Telephone-Speech Database for Speaker Recognition," in Speaker Recognition and its Commercial and Forensic Applications (RLA2C), Avignon, France, 1998, pp. 211-214.
[Bibtex]
@conference{petr98:rla2c,
author = "Dijana Petrovska and Jean Hennebert and Hakan Melin and Dominique Genoud",
abstract = "This article presents an overview of the POLYCOST data-base dedicated to speaker recognition applications over the telephone network. The main characteristics of this data-base are: large mixed speech corpus size ($>$ 100 speakers), English spoken by foreigners, mainly digits with some free speech, collected through international telephone lines, and more than eight sessions per speaker.",
booktitle = "Speaker Recognition and its Commercial and Forensic Applications (RLA2C), Avignon, France",
crossref = " ",
editor = " ",
keywords = "Speaker Verification; Speech Processing",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "211-214",
publisher = " ",
series = " ",
title = "{POLYCOST} : {A} {T}elephone-{S}peech {D}atabase for {S}peaker {R}ecognition",
volume = " ",
year = "1998",
}
• D. Petrovska, J. Hennebert, J. Cernocky, and G. Chollet, "Text-Independent Speaker Verification Using Automatically Labelled Acoustic Segments," in International Conference on Spoken Language Processing (ICSLP 98), Sydney, Australia, 1998, pp. 536-539.
[Bibtex]
@conference{petr98:icslp,
author = "Dijana Petrovska and Jean Hennebert and Jan Cernocky and G{\'e}rard Chollet",
abstract = "Most text-independent speaker verification techniques are based on modelling the global probability distribution function (pdf) of speakers in the acoustic vector space. Our paper presents an alternative to this approach with a class-dependent verification system using automatically determined segmental units. Segments are found with temporal decomposition and labelled through unsupervised clustering. The core of the system is based on a set of multi-layer perceptrons (MLP) trained to discriminate between the client and an independent set of world speakers. Each MLP is dedicated to work with data segments that were previously selected as belonging to a particular class. The last step of the system is a recombination of MLP scores to take the verification decision. Issues and potential advantages of the segmental approach are presented. Performances of global and segmental approaches are reported on the NIST'98 data (250 female and 250 male speakers), showing promising results for the proposed new segmental approach. A comparison with a state-of-the-art system based on Gaussian Mixture Modelling is also included.",
booktitle = "International Conference on Spoken Language Processing (ICSLP 98), Sydney, Australia",
crossref = " ",
editor = " ",
keywords = "Speaker Verification; Speech Processing",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "536-539",
publisher = " ",
series = " ",
title = "{T}ext-{I}ndependent {S}peaker {V}erification {U}sing {A}utomatically {L}abelled {A}coustic {S}egments",
volume = " ",
year = "1998",
}
• D. Petrovska, J. Cernocky, J. Hennebert, and G. Chollet, "Text-Independent Speaker Verification Using Automatically Labelled Acoustic Segments," in Advances in Phonetics, Proc. of the International Phonetic Sciences conference, Western Washington Univ., Bellingham, 1998, pp. 129-136.
[Bibtex]
@conference{petr98:ips,
author = "Dijana Petrovska and Jan Cernocky and Jean Hennebert and G{\'e}rard Chollet",
abstract = "Most text-independent speaker verification techniques are based on modelling the global probability distribution function (pdf) of speakers in the acoustic vector space. Our paper presents an alternative to this approach with a class-dependent verification system using automatically determined segmental units.",
booktitle = "Advances in Phonetics, Proc. of the International Phonetic Sciences conference, Western Washington Univ., Bellingham",
crossref = " ",
editor = " ",
keywords = "Speaker Verification; Speech Processing",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "129-136",
publisher = " ",
series = " ",
title = "{T}ext-{I}ndependent {S}peaker {V}erification {U}sing {A}utomatically {L}abelled {A}coustic {S}egments",
volume = " ",
year = "1998",
}
• D. Petrovska and J. Hennebert, "Text-Prompted Speaker Verification Experiments with Phoneme Specific MLP's," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 98), Seattle, USA, 1998, pp. 777-780.
[Bibtex]
@conference{petr98:icassp,
author = "Dijana Petrovska and Jean Hennebert",
abstract = "The aims of the study described in this paper are (1) to assess the relative speaker discriminant properties of phonemes and (2) to investigate the importance of the temporal frame-to-frame information for speaker modelling in the framework of a text-prompted speaker verification system using Hidden Markov Models (HMMs) and Multi Layer Perceptrons (MLPs). It is shown that, with similar experimental conditions, nasals, fricatives and vowels convey more speaker-specific information than plosives and liquids. Regarding the influence of the frame-to-frame temporal information, significant improvements are reported from the inclusion of several acoustic frames at the input of the MLPs. Results also tend to show that each phoneme has its optimal MLP context size giving the best Equal Error Rate (EER).",
booktitle = "IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 98), Seattle, USA",
crossref = " ",
doi = "10.1109/ICASSP.1998.675380",
editor = " ",
isbn = "0780344286",
issn = "1520-6149",
keywords = "Speaker Verification; Speech Processing; MLP",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "777-780",
publisher = " ",
series = " ",
title = "{T}ext-{P}rompted {S}peaker {V}erification {E}xperiments with {P}honeme {S}pecific {MLP}'s",
volume = " ",
year = "1998",
}
• J. Hennebert, C. Ris, H. Bourlard, S. Renals, and N. Morgan, "Estimation of Global Posteriors and Forward-Backward Training of Hybrid HMM/ANN Systems," in European Conference on Speech Communication and Technology (EUROSPEECH 97), Rhodes, Greece, 1997, pp. 1951-1954.
[Bibtex]
@conference{henn97:euro,
author = "Jean Hennebert and Christophe Ris and Herv{\'e} Bourlard and Steve Renals and Nelson Morgan",
abstract = "The results of our research presented in this paper are two-fold. First, an estimation of global posteriors is formalized in the framework of hybrid HMM/ANN systems. It is shown that hybrid HMM/ANN systems, in which the ANN part estimates local posteriors, can be used to model global model posteriors. This formalization provides us with a clear theory in which both REMAP and ``classical'' Viterbi trained hybrid systems are unified. Second, a new forward-backward training of hybrid HMM/ANN systems is derived from the previous formulation. Comparisons of performance between Viterbi and forward-backward hybrid systems are presented and discussed.",
booktitle = "European Conference on Speech Communication and Technology (EUROSPEECH 97), Rhodes, Greece",
crossref = " ",
editor = " ",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "1951-1954",
publisher = " ",
series = " ",
title = "{E}stimation of {G}lobal {P}osteriors and {F}orward-{B}ackward {T}raining of {H}ybrid {HMM}/{ANN} {S}ystems",
volume = " ",
year = "1997",
}
• J. Hennebert and D. Petrovska, "POST: Parallel Object-Oriented Speech Toolkit," in International Conference on Spoken Language Processing (ICSLP 96), Philadelphia, USA, 1996, pp. 1966-1969.
[Bibtex]
@conference{henn96:icslp,
author = "Jean Hennebert and Dijana Petrovska",
abstract = "We give a short overview of POST, a parallel speech toolkit that is distributed as freeware to academic institutions. The underlying idea of POST is that large computational problems, like the ones involved in Automatic Speech Recognition (ASR), can be solved more cost effectively by using the aggregate power and memory of many computers. In its current version (January 96) and amongst other things, POST can perform simple feature extraction, training and testing of word and subword Hidden Markov Models (HMMs) with discrete and multigaussian statistical modelling. In this paper, the implementation of the parallelism is discussed and an evaluation of the performances on a telephone database is presented. A short introduction to Parallel Virtual Machine (PVM), the library through which the parallelism is achieved, is also given.",
booktitle = "International Conference on Spoken Language Processing (ICSLP 96), Philadelphia, USA",
crossref = " ",
editor = " ",
keywords = "ASR; Speech Recognition; Toolkit; Parallelism",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "1966-1969",
publisher = " ",
series = " ",
title = "{POST}: {P}arallel {O}bject-{O}riented {S}peech {T}oolkit",
volume = " ",
year = "1996",
}
• D. Petrovska, J. Hennebert, D. Genoud, and G. Chollet, "Semi-Automatic HMM-based annotation of the POLYCOST database," in COST 250 workshop on Application of Speaker Recognition Techniques in Telephony, Vigo, Spain, 1996, pp. 23-26.
[Bibtex]
@conference{petr96:cost,
author = "Dijana Petrovska and Jean Hennebert and Dominique Genoud and G{\'e}rard Chollet",
booktitle = "COST 250 workshop on Application of Speaker Recognition Techniques in Telephony, Vigo, Spain",
crossref = " ",
editor = " ",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "23-26",
publisher = " ",
series = " ",
title = "{S}emi-{A}utomatic {HMM}-based annotation of the {POLYCOST} database",
volume = " ",
year = "1996",
}
• V. Fontaine, J. Hennebert, and H. Leich, "Influence of Vector Quantization on Isolated Word Recognition," in European Signal Processing Conference (EUSIPCO 94), Edinburgh, UK, 1994, pp. 115-118.
[Bibtex]
@conference{font94:eusip,
author = "Vincent Fontaine and Jean Hennebert and Henri Leich",
abstract = "Vector Quantization can be considered as a data compression technique. In the last few years, vector quantization has been increasingly applied to reduce the complexity of problems such as pattern recognition. In speech recognition, discrete systems are developed to build up real-time systems. This paper presents original results by comparing the K-Means and the Kohonen approaches on the same recognition platform. Influence of some quantization parameters is also investigated. It can be observed through the results presented in this paper that the quantization quality has a significant influence on the recognition rates. Surprisingly, the Kohonen approach leads to better recognition results despite its poor distortion performance.",
booktitle = "European Signal Processing Conference (EUSIPCO 94), Edinburgh, UK",
crossref = " ",
editor = " ",
isbn = "3200001658",
keywords = "Speech Recognition; ASR; Vector Quantization; HMM",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "115-118",
publisher = "SuviSoft Oy Ltd.",
series = " ",
title = "{I}nfluence of {V}ector {Q}uantization on {I}solated {W}ord {R}ecognition",
volume = " ",
year = "1994",
}
• J. Hennebert, M. Hasler, and H. Dedieu, "Neural Networks in Speech Recognition," in 6th Microcomputer School, invited paper, Prague, Czech Republic, 1994, pp. 23-40.
[Bibtex]
@conference{henn94:micro,
author = "Jean Hennebert and Martin Hasler and Herv{\'e} Dedieu",
abstract = "We review some of the Artificial Neural Network (ANN) approaches used in speech recognition. Some basic principles of neural networks are briefly described as well as their current applications and performances in speech recognition. Strengths and weaknesses of pure connectionist networks in the particular context of the speech signal are then discussed. The emphasis is put on the capabilities of connectionist methods to improve the performances of the Hidden Markov Model approach (HMM). Some of the principles that govern the so-called hybrid HMM-ANN approach are then briefly explained. Some recent combinations of stochastic models and ANNs known as the Hidden Control Neural Networks are also presented.",
booktitle = "6th Microcomputer School, invited paper, Prague, Czech Republic",
crossref = " ",
editor = " ",
keywords = "ANN; Artificial Neural Networks; Speech Recognition; ASR",
month = " ",
note = "Some of the files below are copyrighted. They are provided for your convenience, yet you may download them only if you are entitled to do so by your arrangements with the various publishers.",
number = " ",
organization = " ",
pages = "23-40",
publisher = " ",
series = " ",
title = "{N}eural {N}etworks in {S}peech {R}ecognition",
year = "1994",
}