Chapter 14 Breakthrough Input Method Artifact

Duke threw himself into his research like a man possessed. His time was precious now, with none to waste, so after eating he hurried back to his small home. Sitting in front of the computer, which ran downloads 24 hours a day, he frantically searched for and downloaded all kinds of voice clips and handed them to Kerry for phonetic and semantic analysis and for building the basic knowledge base.

Since moving into the rented house, Duke had feverishly downloaded thousands of voice clips recorded in all kinds of environments and contexts: TV and radio news clips, dialogue from films and TV dramas, science and education narration from Animal World or the National Geographic Channel, and all sorts of everyday scenes, self-shot videos, and pseudo-self-shot videos. Thanks to Ku6, Tudou, YouTube, BT, and eDonkey, Duke learned just how rich and colorful the world's sounds were.

The many sound clips Duke collected were a drop in the ocean next to Kerry's processing ability. Often, the moment a clip was fed in, Kerry had already computed and parsed its phonetic and semantic features, adding a new specimen to the feature library used for speech recognition. The more the pronunciation clips differed from one another, the more valuable they were.

It is like how the more places a person has lived, the better he can hear what the accents of different places have in common and where they differ. Every voice is made up of some distinctive characteristics and some universal ones. The recognition rate of current speech recognition software for standard pronunciation is actually very good.

For example, IBM launched its voice recognition input system ViaVoice many years ago. The software's recognition rate reaches a practical level in quiet environments with standard pronunciation.

Unfortunately, real-world conditions are not so ideal; they vary as much as the four people in Duke's dormitory. Although everyone spoke Chinese, the accents of the four, who came from different places, were very different. When they first lived together there were often problems communicating, but everyone quickly got used to it.

Today's computers are no match for the powerful learning ability of the human brain. Existing speech recognition software has no such ability to learn and adapt, which means it has no knowledge base for identifying these differences in speech characteristics, so of course it cannot reliably recognize unfamiliar types of pronunciation.

Recognizing different accents and eliminating environmental noise are two problems of speech recognition. Solving them requires either a large quantity of first-hand voice data, used to build a massive database of speech characteristics, or a supercomputer as highly intelligent as Kerry.

Drawing on the theoretical material Duke had downloaded, combined with analysis of the various speech clips, Kerry continuously updated the basic speech recognition algorithms and generated different speech recognition simulators. The simulators were needed mainly because the computing power of mainstream computers on Earth was far below Kerry's.

Taking 50% of the computing power of a simulated iPhone 4s as the minimum benchmark, Kerry simulated the accuracy and reaction time of the speech recognition algorithm under different performance conditions. The original version achieved a recognition accuracy of 90% within 5 seconds at the minimum benchmark performance. Of course, this result was already far beyond the level of all speech recognition software on Earth today.

Bear in mind that this 90% accuracy was measured in simulated recognition tests using thousands of Chinese and English voice samples in different accents and different contexts, which means that the filtering of accents and noise was largely accounted for.

This score was much better than that of Apple's Siri, which for now could only understand English. What is more, Siri could only recognize fairly standard English pronunciation. If you don't believe it, try English recordings with Indian or Singaporean accents and see how much Siri can recognize.

On a simulated dual-core computer with a clock speed of around 2 GHz or more, the figures improved to an accuracy of more than 97% within 2 seconds. Reaction time is actually somewhat at odds with recognition accuracy, because recognizing more accurately requires a richer underlying speech corpus.

The wider the sound sampling, the higher the recognition accuracy. But the larger the voice sample library, the longer searching and matching take, which lengthens the reaction time. Therefore, compressing the voice samples and improving the speech search-and-match algorithm had always been the two key points of Kerry's optimization.

Kerry constantly simulated and improved the algorithm for extracting the semantic eigenvalues of speech. By steadily compressing redundant values, he kept reducing the size of the speech sample corpus without introducing distortion. At the same time, he kept improving the intelligent search-and-match algorithm for the corpus.

Duke could not help with optimizing the algorithms, but he had no trouble collecting as many voice samples as possible. So Duke lived a very full life every day, searching for and downloading different types of voice samples for Kerry to analyze and refine, while constantly studying and trying to understand the new processing algorithms Kerry created, all in order to knock on the door of MIT.

Duke needed an innovative paper on basic speech recognition theory that reflected his abilities, but there was no ready-made speech recognition knowledge in Kerry's knowledge base. Such knowledge was too ancient for Kerry, so ancient that even Leme had never added it to him.

What Kerry was doing now was using his powerful simulation capabilities to run through all kinds of speech processing algorithms, starting from the speech recognition theories and algorithms that already existed on Earth.

Searching for a more effective method through simulation was a bit clumsy, but it was backed by Kerry's enormous computing power. With thousands of candidate algorithms simulated every second, even this clumsy approach produced considerable results. Several promising optimizations were found, which raised both recognition rate and reaction time to a new level.

However, these achievements still had to be written up in language and theory that people on Earth could understand, in a way that would actually get through to them. This was a new challenge for both Kerry and Duke, because Kerry's was not a mechanical binary thinking model built on 0s and 1s, but a biological polymorphic thinking model.

By now Kerry could simulate more than ten common Earth PC virtual machines at once. To give Kerry an accurate understanding of the computing capabilities of Earth's computers, Duke had bought four hosts with different interfaces and nearly twenty mainstream PCs on the market as performance benchmarks, and Kerry had then built corresponding virtual simulators based on the performance of these configurations.

However, since nobody else needed to understand these special virtual machines, Kerry could build them according to his own way of computing. So although their performance was comparable, their implementation was very different. Next to the gap between the RISC and CISC CPU architectures on Earth, the difference was of another order of magnitude entirely.

Therefore, after Kerry completed an algorithm in his own mode, he had to re-implement it according to Earth's binary rules of 0 and 1. This was a huge challenge for Kerry, not to mention that the paper then had to be abstracted from that implementation. It needed not only the software algorithms but also a mathematical model that could be proved using Earth's mathematics.

So Kerry ran almost around the clock. Eventually the simulated algorithm achieved a recognition rate of 97% within 1 second on the lowest benchmark, and more than 99% within 1 second on a dual-core 2 GHz computer. After that, it took two more weeks.

By the time Duke had finished reading more than a dozen mathematical monographs and had downloaded and studied several open-source speech recognition programs, Kerry had completed the paper on a new speech recognition algorithm and helped Duke develop a speech recognition program that ran on Earth's computers. Its first application was to be packaged as a voice input method.

Cape Forum. Having completed the two tasks of the speech recognition software and the paper, Duke was now relaxing.

He registered an alternate account and joined a discussion thread about how the plot of Kerry's war would develop. To test the new software, he imitated various accents as he spoke. His words were quickly recognized by the computer and turned into text, replying to the character and plot analyses posted by the forum's literary types.

Duke knew the plot well, so of course his analysis was very clear. He often posted long, incisive analyses, which quickly drew the attention of fans. And with voice recognition input, although his replies were substantial, each one still arrived faster than anyone else's in the forum.

It felt as though he was faster even than a professional stenographer.

"Hey, buddy, what input method are you using? How do you reply so quickly? It's almost instant!" One literary youth, whose curiosity about Duke's lightning-fast replies finally got the better of him, couldn't help asking.

What input method? Duke was stunned for a moment, then realized that in testing the speech recognition input method he had just developed, he had forgotten to control his speed. Without his noticing, such slick instant replies had attracted attention.

"A new type of voice input method," Duke said in a Tieling accent like Lao Zhao's. His voice was immediately converted into text on the computer screen. There were plenty of samples of Lao Zhao's voice, so the recognition rate was naturally no problem at all.

In the discussion just now, Duke had run tests using every way of speaking he could think of, with 100% recognition accuracy. Although the only background noise was the TV turned down low, still some way from a truly complex noise environment, the software had reached this level even as Duke changed his accent and tone. It was almost possible to see that the era of keyboard input was over, and that the launch of this voice input method would announce the beginning of a new era of input.

"Come on, buddy, don't make fun of me. I have used Penguin's voice input method, and it has nowhere near your speed and accuracy," the literary youth replied in disbelief.

"Haha, it's an internal test version I just got. Oh, it's called the sala input method. If nothing unexpected happens, you'll be able to download the preview version from the major websites soon." Duke thought of Apple's Siri and casually made up a similar-sounding software name for his reply.

"Really? Which company developed such a great input method?"

"This is the latest work developed by the company. It is under testing, haha, but it is really useful. It feels really good to get rid of the keyboard."

"Is it paid or free? If it's free, can you send me your test version? My email is [email protected]"

"Brother, please send me one to [email protected]"

Soon the thread drifted off topic, and more and more people began following the conversation between the two. In the end, they too joined the ranks of those asking for the sala input method. For a while, the screen was filled with replies requesting the sala voice input method.

Duke, who had caused a sensation yet again, never expected a software test to turn out like this. It showed just how widely this voice input software could be applied. This time, however, Duke did not rashly agree. Even with his limited emotional intelligence, he knew it would be completely inappropriate to give the software away for free at this point. Evidently, with the surge in his IQ, and especially after his negotiations with the two editors, Duke's emotional intelligence was showing signs of progress too.