Software improves deaf kids' listening and speech skills

HHIssues
03-06-2001, 10:18 AM
From Excite
http://news.excite.com/news/bw/010305/ca-sensory

ABC-TV's 'Prime Time' to Showcase Role of Sensory's Fluent Animated Speech Technology in Teaching Deaf Children to Speak; Software Transforms Diane Sawyer's Image & Voice into Animation Demo

Updated 10:16 AM ET March 5, 2001
SANTA CLARA, Calif. (BUSINESS WIRE) - At the Tucker-Maxon Oral School in Portland, Ore., deaf children ages 6 through 12 are improving their listening and speech-production skills with the help of Baldi, a dome-headed, talking, computer-generated face. Tucker-Maxon's unusual talking tutor -- along with the powerful software technology that combines 3D animation with speech recognition and audio-visual generation of speech -- will be showcased on the ABC-TV news program "Prime Time" on Thursday, March 8 (10 p.m. Eastern). To demonstrate the power and accuracy of the software as a teaching tool for the profoundly deaf, the voice and face of "Prime Time's" co-anchor Diane Sawyer will be converted to a so-called conversational agent.

The software that allows the animated face of Sawyer -- and Baldi -- to talk and be understood by Tucker-Maxon students is Sensory Inc.'s Fluent Animated Speech(TM) technology. Sensory, based in Santa Clara, is a leading provider of embedded speech technology.

Origins of the Software

Sensory's Fluent Animated Speech software grew out of research and development efforts, primarily at the Oregon Graduate Institute Center for Spoken Language Understanding and at the Perceptual Science Laboratory at the University of California, Santa Cruz. In 1997, researchers from these two institutions left their academic posts to found Beaverton, Ore.-based Fluent Speech Technologies, which Sensory acquired last October.

Technology Behind the Tucker-Maxon Story

With Sensory's Fluent Animated Speech technology, programmers and non-programmers alike can control the facial expressions, emotional expressions and lip synchronization of an animated 3D agent or avatar. At Tucker-Maxon, for example, educators with minimal computer skills easily design programs that both speak and listen. The software incorporates the animated face, Baldi, whose articulators are aligned with the utterances produced in either synthesized or natural speech. The motion of Baldi's lips, eyes and facial expressions adds meaning to the words "spoken" by the computer. Around a topic chosen by the teacher, Baldi can ask a question; the student will be prompted to respond. That response will determine the next turn of the dialogue.
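
The question-and-response flow described above can be pictured as a small branching script. The following is only a minimal sketch under stated assumptions: the Turn class, the LESSON table and the function names are hypothetical and do not come from Sensory's software, and the student's replies are assumed to arrive already recognized as text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Turn:
    """One tutor prompt plus the branches a recognized reply can take (hypothetical)."""
    prompt: str
    branches: Dict[str, str] = field(default_factory=dict)  # reply -> next turn id
    fallback: Optional[str] = None  # next turn id when the reply is not recognized


# Hypothetical lesson a teacher might author around a chosen topic.
LESSON = {
    "ask_color": Turn("What color is the apple?",
                      {"red": "praise", "green": "praise"}, "retry_color"),
    "retry_color": Turn("Listen again: what color is the apple?",
                        {"red": "praise", "green": "praise"}, "retry_color"),
    "praise": Turn("Good job!"),
}


def next_turn(lesson, current_id, recognized_reply):
    """Pick the next turn id from the student's (already recognized) reply."""
    turn = lesson[current_id]
    key = recognized_reply.strip().lower()
    return turn.branches.get(key, turn.fallback)


def run_dialogue(lesson, start_id, replies):
    """Return the prompts the tutor speaks for a scripted list of replies."""
    spoken = []
    current = start_id
    pending = iter(replies)
    while current is not None:
        spoken.append(lesson[current].prompt)
        if not lesson[current].branches:
            break  # terminal turn: nothing more to ask
        reply = next(pending, None)
        if reply is None:
            break  # no more scripted replies
        current = next_turn(lesson, current, reply)
    return spoken
```

Calling run_dialogue(LESSON, "ask_color", ["blue", "Red"]) walks through the question, one retry and the closing praise, which shows how each response selects the next turn.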

"The ability to create realistic, talking characters is no longer of interest solely to professional animators or producers of motion pictures," said Todd Mozer, president and chief executive officer of Sensory. "Our Fluent Animated Speech technology will bring such capabilities within the reach of nearly everyone."

Applications in Education and Beyond

By achieving its unprecedented accuracy of speech and facial animation, Sensory's Fluent Animated Speech technology will enable animated characters to play roles in Internet-based commerce, entertainment and customer support as well as education. Possible applications include adding an animated agent to a text or voice message; automating an interactive web host or agent; adding personality and emotional expressions to a web character or message; and creating online games in which the players control the speech of the characters.

New Animation Technology Represents a Breakthrough

The Fluent Animated Speech technology employs a non-linear morphing technique that enables Sensory to take a few dozen static pictures and blend them to create a virtually unlimited assortment of expressions and articulations. The technology provides memorable, highly accurate real-time lip-synching, as well as the delivery of emotional content by a 3D animated agent, synchronized to a variety of speech and text sources. The 3D models can be created using off-the-shelf 3D graphics tools.
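
To illustrate the general idea of blending a small set of static targets into many intermediate shapes, here is a minimal sketch of morph-target blending. It is not Sensory's algorithm: the release does not say what makes its morphing non-linear, so the smoothstep easing below is purely an assumed stand-in, and all function and target names are hypothetical.

```python
def smoothstep(t):
    """Ease a weight in [0, 1] along an S-curve (assumed non-linearity)."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def blend_targets(neutral, targets, weights):
    """Blend a neutral mesh toward several static targets.

    neutral: list of (x, y, z) vertices for the resting face
    targets: dict of target name -> list of (x, y, z) vertices, same length
    weights: dict of target name -> weight in [0, 1]
    """
    result = [list(vertex) for vertex in neutral]
    for name, weight in weights.items():
        eased = smoothstep(weight)
        for i, (rest, goal) in enumerate(zip(neutral, targets[name])):
            for axis in range(3):
                # Add this target's eased offset from the neutral pose.
                result[i][axis] += eased * (goal[axis] - rest[axis])
    return [tuple(vertex) for vertex in result]
```

Because every frame is a weighted combination of the stored targets, a few dozen shapes can yield a practically continuous range of expressions.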

The speech output comes from Sensory's Fluent Speech(TM) Text-to-Speech engine, which can reside in either a client or server environment. The Fluent Speech Text-to-Speech engine is an LPC (linear predictive coding), diphone-based speech synthesizer capable of expanding or contracting pitch periods and changing speech rates to produce a variety of sounds. The LPC approach makes it possible for the Fluent Animated Speech technology to synthesize high-quality speech using very little computer memory.
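
As a rough illustration of why an LPC approach is compact, the sketch below runs a pulse-train excitation through an all-pole filter defined by a handful of coefficients. It covers only the textbook LPC synthesis step, not diphone concatenation and not Sensory's engine; the function names and the coefficient values used in examples are hypothetical.

```python
def pulse_train(num_samples, pitch_period):
    """Voiced excitation: one unit pulse every pitch_period samples."""
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(num_samples)]


def lpc_synthesize(excitation, coeffs, gain=1.0):
    """All-pole LPC synthesis: s[n] = gain * e[n] + sum_k coeffs[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        sample = gain * e
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                sample += a * out[n - k]
        out.append(sample)
    return out
```

Changing pitch_period alters the pitch of the excitation while the stored coefficients stay the same, which is one way to read the release's remark about expanding or contracting pitch periods; only the short coefficient list needs to be kept per sound.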

The Fluent Animated Speech 3D animation comprises a general-purpose OpenGL- or Direct3D-based real-time 3D rendering engine and a viseme generation engine. (A viseme is the visual component of a phoneme, which is the smallest individual component of speech.) The viseme generation engine is a coarticulation package that generates weighted morphing data (in the form of visemes) that drives the animated speech from either synthetic or natural speech.
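
The phoneme-to-viseme relationship can be sketched as a lookup that turns a timed phoneme sequence into a timed viseme track. The table below is a hypothetical, heavily simplified mapping chosen for illustration; the actual viseme set and data format used by Sensory are not given in the release.

```python
# Hypothetical mapping: several phonemes share one visible mouth shape.
PHONEME_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "aa": "jaw_open", "iy": "lips_spread", "uw": "lips_rounded",
}


def viseme_track(timed_phonemes):
    """Convert (phoneme, start_s, end_s) triples into (viseme, start_s, end_s).

    Back-to-back phonemes that look the same on the face are merged into
    one segment; unknown phonemes fall back to a "neutral" shape.
    """
    track = []
    for phoneme, start, end in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
        if track and track[-1][0] == viseme and track[-1][2] == start:
            track[-1] = (viseme, track[-1][1], end)  # extend previous segment
        else:
            track.append((viseme, start, end))
    return track
```

The same track could be produced from a synthesizer's phoneme timings or from phonemes aligned to recorded speech, which matches the release's point that either source can drive the animation.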

The coarticulation package is an important part of Sensory's special speech software code that enables animated characters to speak with realistic facial and mouth movements. In humans, coarticulation is the coordination by the brain of the lips, tongue and jaw to create the movements needed to produce adjacent vowels and consonants simultaneously during normal speech. Coarticulation ensures that speech is produced smoothly, and it spreads out acoustic information about a vowel or consonant to help a listener understand what is being said. With Sensory's coarticulation package, animated characters can communicate at five syllables per second - the same rate at which humans produce speech.
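
One common way to model coarticulation in animation is to let each mouth shape's influence overlap its neighbors in time and then normalize the influences into morph weights. The sketch below takes that approach purely as an assumption; the release does not describe how Sensory's package works, and the exponential falloff, the 0.05-second width and the 30 frames-per-second figure are all hypothetical choices.

```python
import math


def dominance(t, center, width):
    """Influence of one segment at time t, decaying away from its center."""
    return math.exp(-abs(t - center) / width)


def coarticulated_weights(segments, t, width=0.05):
    """Normalized viseme weights at time t for (viseme, start_s, end_s) segments."""
    raw = {}
    for viseme, start, end in segments:
        center = (start + end) / 2.0
        raw[viseme] = raw.get(viseme, 0.0) + dominance(t, center, width)
    total = sum(raw.values())
    return {viseme: value / total for viseme, value in raw.items()}


def frames_per_syllable(syllables_per_second, frames_per_second):
    """How many animation frames are available to render one syllable."""
    return round(frames_per_second / syllables_per_second)
```

At the five syllables per second quoted above, each syllable lasts 0.2 seconds, so an assumed 30 frames-per-second renderer has about six frames per syllable, and neighboring mouth shapes necessarily overlap within those frames.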

The Fluent Animated Speech technology's 3D rendering engine allows the rendering of arbitrary 3D models and uses a morphing-based approach to animation. Exporters for 3D authoring tools enable 3D models to be saved in a compatible format. Additionally, the Fluent Animated Speech technology can take advantage of other vendors' existing tools for the scripting of speech and facial content and the automatic generation of expressions and facial gestures.

Users can control lighting and background images as well as the characters being animated, and AVI output is available. The Sensory technology comes with a selection of human and animal 3D models that include the mouth and facial targets required for animating (i.e., not every feature in a face or mouth needs to be animated - and thus modeled - for creating realistic speech). As a result, users can quickly create realistic animated characters, along with background environments.

Price, Availability and System Requirements

Sensory's Fluent Animated Speech technology is available now. For networked applications, typical pricing is based on an Application Service Provider (ASP) model with an annual per-port fee. For embedded applications, pricing is under $2 per unit in volume. The technology currently runs under Windows 95/98/2000/ME on a minimum 266 MHz Pentium II processor with at least 64 MB of RAM.

About Sensory, Inc.

Founded in 1994, Sensory, Inc., is the leading provider of high-quality, low-cost speech recognition and speech synthesis technology. Sensory's speech technology is embedded in consumer products such as personal electronics, Internet appliances, interactive toys, and high-end telephone and automotive applications. Sensory offers a complete line of integrated circuit (IC) and embedded software solutions, including the Interactive Speech(TM) line of low-cost ICs and the Fluent Speech(TM) large-vocabulary software engine. Sensory's customers include leading companies in the consumer electronics and embedded product markets, such as JVC, Hasbro, Mitsubishi, Mattel, Sega, Sharper Image, Fisher-Price, Sony, Tektronix, Toshiba, Uniden, VOS and Westclox. More information is available from Sensory's web site at http://www.sensoryinc.com.

Note to Editors: Interactive Speech, Fluent Speech and Fluent Animated Speech are trademarks of Sensory, Inc. All other trademarks are the property of their respective owners. Details about the Tucker-Maxon application are available at http://cslu.cse.ogi.edu/tm

Note: A Photo is available at URL:
http://www.businesswire.com/cgi-bin/photo.cgi?pw.030501/bb8

Contact: Sensory, Inc. Erik Soule, 408/240-1575 marcom@sensoryinc.com or
Martell Communications Lisa Figlioli, 203/625-0082 lfiglioli@martellpr.com

** HHIssues **
