You are a black technology boss, you didn't say it sooner!

Chapter 56 Voice Dubbing

Although he didn't get to bed until three in the morning, Shao Yiming got up on time at eight, dressed himself up respectably, and arrived at the company at nine o'clock sharp.

Shao Yiming had worked in animation for three years and in Internet operations for five before resigning to start his own company. The company was very small, with only a dozen or so people so far. Its business was paid audio, a field that had gradually emerged over the past two years.

The company had finally reached its Series A round. Once it secured this first round of financing, the startup could be called an initial success. This was the most critical moment, and there could be no mistakes.

The Venture Capital Conference ran from [-]:[-] am to [-]:[-] pm, and the time after [-]:[-] pm was set aside for free interviews between investors and entrepreneurs. At ten o'clock, Shao Yiming brought several employees to the venue.

His speech was scheduled for around 11:30, which happened to be the finale of the morning session.

When his name was called, he straightened his collar and strode confidently onto the stage.

"In recent years, with the rapid development of the Internet, information dissemination has shifted from two tiers to many, and the concept of decentralization has grown more and more popular. This has led to the rise of UGC platforms, and UGC has become the model everyone in the content field is chasing."

"What is UGC? User-generated content. Yes, it offers diversity, entertainment, and richness. Every operator knows that users are the best content producers. But today, I want to throw cold water on it: the UGC model doesn't apply to audio."

"As of today, what UGC content exists in the audio field? Song covers and audiobooks. Anything else? Nothing. These two are currently the most mature forms of audio UGC. Let's take a look at where they stand."

"Song covers can hardly make a profit from the song itself; the money has to come through other forms such as live streaming. In other words, they are not paid audio in essence, and not something our industry should be doing. As for user-made audiobooks, the infringement issue has not yet been settled, so making a profit is out of the question."

"Other audio content, such as audio programs, radio dramas, and original songs, is not something users can produce. For amateurs, the threshold is too high. We now see many radio drama enthusiasts spontaneously forming groups to produce radio dramas, but such amateur groups can only turn out one or two short episodes and cannot sustain long-term, stable updates. It is hard to charge for this kind of content; it is not high-quality, sustainable paid content."

"Everyone, every road for UGC in the audio field is blocked. In our industry, there is no paid UGC."

Shao Yiming opened his speech by dissing UGC, which ran completely against the industry's mainstream call to "seek a new UGC model", and it caught the interest of a large number of investors.

When he felt the room was warm enough, he changed the subject, stopped dissing UGC, and began to talk about his company's entrepreneurial philosophy of "creating high-quality PGC".

Shao Yiming had worked in operations for five years and had taken part in no fewer than three rounds of financing. He knew how to attract investors. Besides the team and the product, venture capitalists liked to see some lofty-sounding entrepreneurial ideas. Even if the team was weak and the product had not yet taken shape, as long as the idea was in place, the money would be too.

After Shao Yiming finished speaking, the audience applauded thunderously, and he stepped down from the stage, satisfied.

As soon as he got back to his seat, someone approached him: "Mr. Shao, the representative of our xx venture capital firm would like to make an appointment with you. Do you have time this afternoon?"

Shao Yiming happily accepted one invitation after another, and his afternoon was booked from three o'clock to eight o'clock.

At the same time, in another building in the same city, employees of Echo Technology were busy.

Cheng Tao had been very proud of himself lately. He was originally a project manager at Echo. When he signed the outsourcing contract with Lou Qingyan, he hadn't thought much about it; he simply figured the speech synthesis would suit Echo's dubbing project. He never expected it to be good for far more than Echo Dubbing: this technology could lift the whole of Echo sky-high.

Echo Technology's research and development direction was artificial intelligence + voice, and the speech synthesis software delivered by Lou Qingyan had reached the pinnacle of this field.

It could be said that with this technology in hand, every existing R&D project at Echo was obsolete, and a complete overhaul was necessary.

That day, after meeting Lou Qingyan and heading back, Cheng Tao stayed in shock for a long time. Before he had even recovered, he phoned Echo's CEO, never mind that it was after working hours.

The president was naturally furious at being called after work, but at Cheng Tao's insistence he held back his temper and returned to the company.

But after seeing the speech synthesis software, all his temper disappeared, leaving only an unstoppable surge of shock.

He was shocked for 10 minutes, then closed his eyes and thought for 10 minutes, and the first thing he did when he opened his eyes was to promote Cheng Tao from project manager to R&D director.

Then he halted all of Echo's research projects.

The overnight order to halt threw the whole company into a panic; everyone thought the company was about to close down.

The next day, it was the R&D staff's turn to be baptized by the storm. They were called together to see the new speech synthesis technology, and were so shocked that they could not speak.

The R&D department of a technology company has to stay connected to academia. These people read academic news every day and always kept an eye on the progress of cutting-edge technology. It was not that they were ignorant; no academic journal, however authoritative, had ever published a technology this shocking.

Yes... shocking.

How powerful was this technology? Even with it in hand, Echo could enjoy it exclusively for only a short while before being forced to make concessions under pressure from all sides. That was because it was too far ahead; had it been only a little ahead, things would not have come to this.

The software's code was laid open in front of them, and no one could understand it. 60% of the software's algorithms were brand new, and the mathematical models behind them were unheard of.

Echo's researchers: my whole family is shocked.

This was also one of the reasons Lou Qingyan was willing to trade the technology to Echo instead of developing it himself. He couldn't even hold on to the special effects plug-in, so how could he hold on to epoch-making speech synthesis? A company that held such an overpowered technology before it was fully fledged would almost certainly be swallowed up.

What Deep Space Technology feared most was being swallowed by a giant. Just look at how much BAT loves playing the game of big fish eating small fish. To let Deep Space hold out without raising funds, Lou Qingyan had done everything he could to create a 20 billion myth, so naturally he would not allow such a big threat to appear.

Echo was different. The company had already grown large, and could not be swallowed even if someone wanted to.

However, didn't Lou Qingyan suffer a big loss by handing this epoch-making technology over to another company?

Echo's president was wondering the same thing until Cheng Tao said to him: "Have you forgotten? We signed a VAM agreement with him."

It was not just a VAM agreement, but an unprecedented "technology VAM".

If the software Deep Space finally delivered to Echo could perform function xx, Echo had to fulfill clause xx. It was that kind of bet-on agreement.

At the time, Echo's CEO had thought it was just a small outsourcing project and hadn't paid much attention. Thinking about it now, he was alarmed. "Then what terms did we agree to?"

Cheng Tao hung his head low.

At the beginning, seeing how unrealistic those terms were, he had assumed Deep Space was not sure it could develop the specified functions and was simply writing the terms as high as it could. After all, in a bet, the higher the upper limit the better. What if a miracle happened?

Who could have known that what was a miracle to him was simply a matter of course to Lou Qingyan.

Hearing just the first clause was enough for Echo's president. "What do you mean, give him 10% of the technology-related profits? Isn't that taking 10% of our company's shares out of thin air???"

He was not exaggerating. Once the speech synthesis technology appeared, every Echo development team would have to learn it and then apply what they learned to other projects. All of Echo Technology's products would have to be updated under the guidance of this technology. Lou Qingyan's demand for 10% of technology-related profits was no different from taking 10% of their shares.

Cheng Tao had no choice but to say: "There is a time limit, there is a time limit..."

Compared to this one, the other clauses were nothing. The president glanced over them, lost all his temper, and said gently: "This Fanxing.com, he's asking us to help support it? What's the difference between that and our own platform website?"

He really was gentle. It was Cheng Tao who didn't dare to answer.

That was all in the past. Back in the present, the speech synthesis software and Fanxing.com were about to be released together, and Cheng Tao was supervising the preparations.

Yes, only Cheng Tao. As for Lou Qingyan, who was that?

Deep Space? It didn't exist. All the publicity and distribution work was done by Echo, and Deep Space just sat back and enjoyed the benefits.

While sitting there waiting for the launch time, Cheng Tao thought of Lou Qingyan.

A rich second-generation heir who had lost his inheritance rights and been kicked out of the house, he had once been the number one online celebrity on the entire platform.

Once upon a time, his life had been all about showing off his wealth, throwing money around, eating, drinking, and having fun. He had a famous saying: "Every morning when I wake up, I have two worries. The first is how to burn money today, and the second is who to hate today."

Lou Qingyan's character was actually unblemished, and he had no unforgivable dark history. He was just a little timid, a little naive, and a little hot-tempered.

But his life was too enviable. He did whatever he wanted and whatever he liked, never had to put up with things that didn't go his way, scolded anyone he disagreed with, and was pampered without reserve by a doting father... Everyone felt a little jealousy or resentment, so his anti-fans were everywhere.

So when he hit rock bottom, even though he was the victim, people just gloated.

Lou Qingyan's social media account was no longer updated, yet countless people still came every day to mock him, using words that dirtied the eyes of anyone who read them.

They used this kind of taunting to vent the anger they had accumulated all day, as if Lou Qingyan's disappearance was their victory against reality.

Twelve o'clock. It was time. The software's official website went live, the desktop client and mobile app were listed in the app stores, the advertiser gave an OK sign, and Fanxing.com officially launched.

Cheng Tao looked at the newly published web page and suddenly got goosebumps all over his body.

No one knew that the dejected, spoiled rich heir who had been gone for so long was changing the world.

At three o'clock in the afternoon, Shao Yiming arrived at the appointed place on time, waiting for the meeting with the investor.

However, all he received was notice that he had been stood up.

"Sorry, something has come up for our manager and he can't make it. Let's talk about the investment when we have a chance later."

...When we have a chance? What chance?

This was clearly a refusal!

One firm's rejection would have been fine. Over the next few hours, however, Shao Yiming was turned down by every firm that had booked a meeting.

At eight o'clock in the evening, no matter how he pleaded, the last venture capital firm still did not come.

He sat alone in the restaurant, blankly, thinking: It's over.

Financing fell through.

Every investor interested in paid audio had attended this venture capital conference. If he failed to attract investment here, the hope of attracting it later would be slim.

Paid audio was a new hot spot. There were countless pioneers in the industry, and there were already successful companies like Everest. Without money, a company could not expand quickly; if it could not out-compete the others, it was doomed to fail.

It's over, it's over.

When the lights came on outside the window, Shao Yiming's face was reflected on the glass. For an hour, his mind was in chaos.

An hour later, he suddenly thought: why?

Why had so many VCs stood him up one after another? Could someone be working against him behind the scenes?

Before he had time to think it through, an employee called and yelled at him, "Brother Shao! Did you see it! The sky has changed! What should we do! What should we do!"

"What's going on?" Shao Yiming was extremely calm at that moment. Even as he answered the phone, he was still wondering who would go to so much effort to trip him up.

"Go and read the news, Brother Shao!!! It's been all afternoon, don't you know!!!"

Shao Yiming quickly hung up and opened a news website, where a huge headline could not be ignored.

"Echo Technology releases epoch-making speech synthesis technology today; our country's artificial intelligence now leads the world!"

He froze for a moment before opening the article and scrolling down to read it closely.

The whole article read as obviously excited yet restrained. The editor was really good, too: in very rational and objective language, he praised Echo from top to bottom and seamlessly introduced the new software's various functions.

The software's name was very ordinary: Echo Dubbing, exactly the same as before.

Previously, though, Echo Dubbing had been a web program; this time there was a client.

"Based on this epoch-making speech synthesis technology, Echo Technology has only released one product, 'Echo Dubbing', and this software alone has already shown astonishing functional effects. The follow-up development of this technology is very worth looking forward to."

After reading about the software, Shao Yiming suppressed his doubts and went to the app store to download the mobile app.

The mobile app's design was very simple: enter a piece of text and convert it into speech, with a basic parameter adjustment function.

AI dubbing has very high hardware requirements. A mobile phone could not convert too much content at once, so each input was limited to fifty characters.

Shao Yiming immediately thought of the animation script that had just passed review, picked a line from it at random, and entered it.

After the text was entered, an option to select the timbre popped up. The presets covered the most basic voices, child, teenager, young adult, middle-aged, and elderly, each in male and female versions.

Click a timbre to output the sound directly.

It was only a short line of dialogue. Shao Yiming tried every timbre and listened to each three times, and the more he listened, the more unbelievable he found it.

It's speech synthesis, real speech synthesis.

Unlike singing synthesis software, speech synthesis is not a simple arrangement and combination of sounds; it also requires natural language processing, recognition of the text's semantics, and so on. Here, the program recognized the semantics of a sentence, assigned suitable inflection, rhythm, and tone according to those semantics, and then played it back.

It was as realistic as a real person talking on the phone!

The program's delivery might not be the most beautiful, but it definitely fit the context, so nothing sounded out of place.

In other words, the software does not have the strength of a top voice actor, but it has the ability of an ordinary voice actor.

Shao Yiming forced himself to stay calm and comforted himself: "It's nothing special, it's soulless. Don't be afraid. Don't be afraid."

After all, a machine is a machine. Even if it could dub, it was just routinely imitating human intonation and could not deliver powerful emotional expression. It was like a bad actor who insists on acting.

He picked a timbre at random, tapped confirm, and came to the next page: emotion selection.

There are four sliders below, which are Joy, Excitement, Anger, and Fear.

When Shao Yiming moved the four sliders, it felt unreal, just like dragging the RGB sliders of an image.

After sliding, the voice generated in real time showed subtle changes in tone.

There was also a line of small print at the bottom of the page: "Log in to the desktop client to adjust more emotional dimensions. You can also design your own dimension models, save parameters, and create emotional filters."

Shao Yiming hesitated for a moment, ignored it, and clicked OK to go to the next page.

This step was called "Audio Liquefaction". Selecting a parameter generated a curve on the screen, and smudging it with a finger changed its shape. The vertical axis was the parameter, and the horizontal axis was time.

Volume was a straight line. Pushing up a small hill with a finger changed the volume over the time span that the hill covered.

Intonation was a curve; a finger could directly make the voice rise or fall.

Tone of voice was a broken line, which could create an emphatic or a soft delivery.

The placement of the voice could be adjusted to produce breathy sound, nasal sound, chest resonance, or dantian sound.

There were also other parameters such as pitch and speech rate. Dialogue that had been "ordinary and soulless" could be adjusted in every direction in this interface, and the output was astonishingly varied.

The fourth step is the last step. After completing this step, the system generates an mp3 file.

Shao Yiming listened to it over and over again, finally leaned back on the chair, covered his face with his hands, and let out a long sigh.

Except for some discrepancies in timbre, the whole sentence was exactly as he had imagined.

After a long time, he finally found a reason to comfort himself: "...The fourth step is too complicated. Someone who has trouble making choices could fiddle with it for a year. If every sentence has to be tuned like this, when would a dub ever get finished?"

Powerful it was, but the efficiency suffered.

As soon as this idea came up, a prompt popped up on the app page: "Do you feel that the parameters are too detailed, creating audio is too troublesome and time-consuming? Download the pc client and experience efficient AI dubbing."

Shao Yiming immediately caught the key point: Are the functions on the PC side more powerful?

The mobile app advertised the PC version at every turn; it was obviously just a promotional trial of the PC version.

If the mobile app was already this powerful, how terrifying must the full feature set of the PC version be?

Shao Yiming got up at once. Too impatient to go home, he found the nearest Internet cafe and downloaded the software as soon as the computer was on.

Compared with the simplicity of the mobile app, the PC version was bloated. Even on a fiber connection, it took a full hour to download and another half hour to install.

It was paid software, with a one-week trial of the full feature set and a one-month trial of the basic features.

Shao Yiming couldn't wait to open it, and found that it was indeed far more powerful than the mobile app.

Text input had changed to importing text files, there was no limit of [-] characters, and thousands of words could be imported at a time.

He logged straight into his cloud drive, downloaded his own script, and selected a short passage to import.

Once the text was read in, it was displayed in the blank box on the left, and the system automatically recognized the text format. Advertisements, novels, and scripts were each laid out differently, with a matching window layout. It was just like ps, which has different window presets for drawing, retouching, and graphic design.

As soon as the text is imported, the software will pop up a prompt, "It has been detected that your text type is a script, do you want to switch to radio drama mode?"

After the switch, the entire window layout changed drastically and was divided into five modules.

The upper left was the text timeline, where each line of text was aligned with the audio timeline. The lower left was the character window. The characters automatically recognized from the script were already neatly arranged in it, and clicking one opened a customizable character card covering the character's timbre, emotional filter, speech rate, tone, and so on.

The upper right is the attribute editor, and the lower right is the sound effect material library.

At the bottom of the entire page is a multi-track timeline, which can edit the audio as a whole.

Seen this way, the software was already extremely professional, and its UI design was not inferior to the Adobe series.

The first step was still to select a timbre, but instead of outputting audio directly, the selected timbre was filled into the character card.

After selecting the timbre of the first character, Shao Yiming realized that he should fill in the character cards first and adjust the parameters of the generated audio afterward.

Besides the usual speech rate and tone, the character card also had an "emotional filter" option, which could be a preset or customized. The customization window had more than 20 emotional dimensions, such as happiness, sympathy, jealousy, disappointment... The 20-plus sliders were dazzling.

No character can be happy or sad forever. Shao Yiming filled out the first character, thought for a moment, made a copy, added "(lower)" in brackets after the character's name, fine-tuned the parameters, and swapped in a different filter to represent the character's state when unhappy.

Soon the characters and their copies were all created, and the individual sentences could be edited.

The editable parameters of each sentence were displayed in the attribute editor on the right. They were almost exactly the same as those in the character card, including the emotional filter option. Shao Yiming immediately realized that the so-called character card was actually a parameter model.

In the attribute editor, the only thing that differed from the character card was the audio liquefaction curve. As in the mobile app, it allowed the subtlest adjustments at different points in time within a sentence.

Once the characters and sentences were edited, the software automatically synthesized the audio. The audio appeared in the sound effect material library at the lower right and could be dragged into the multi-track editor for editing. The library was linked to the cloud, from which users could download any sound effects they needed. Of course, they could also import their own.

So far, this software is just an ordinary dubbing software. Its function is nothing more than imitating human voice and refining various adjustable parameters.

To make a radio drama with this software, one only needed to set up the characters, import the script, and output the audio. But a radio drama produced this way was very "standardized"; in Shao Yiming's words, it was "ordinary and soulless."

Fine-tuning the radio drama could achieve very good results, even a level that professional voice actors could not reach, but that was far too slow; tweaking it bit by bit would take forever.

Shao Yiming wondered: wasn't this software of rather marginal value? Works generated with one click were fairly rough, and fine production was less efficient than human dubbing.

Of course, the software could be used to generate large numbers of rough UGC works. People's expectations of UGC have always been low; just like the films netizens make themselves using games, however rough they are, they still have fans.

But it could only flood the low-end audio market with rough works, or produce one or two extremely polished "masterpieces". Real mid-range quality content could still only be produced by PGC.

No sooner had the thought come up than Shao Yiming discovered another function in the software.

"AI voice changer, what is this?"

He shivered all of a sudden, thinking of AI face swapping, AI person swapping... "Fuck, it's not what I think it is, is it?"

It was exactly what he thought.

AI voice changing meant that a user could voice a character himself, and the AI system would learn his acoustic data and replace his voice with one of the system's preset timbres. It was similar to a voice changer, but not real-time.

After understanding the function, he took off his earphones, slumped in the chair, and murmured: "It's over."

The software's last weakness had been patched.

Fine production too inefficient? No problem, there was AI voice changing. Anyone who found adjusting parameters troublesome could record the lines himself and then swap in the character's voice.

Shao Yiming worked in audio content operations, and he immediately saw what earth-shaking changes the release of this software would bring to the way radio dramas were produced.

In the low-end market, once the characters were set up, the script imported, and suitable sound effects added, a radio drama could be generated with one click. This kind of radio drama was formulaic, with all the intonation and tone that should be there, but lacking expressiveness. It was like an idol drama with poor acting.

In the mid-range market, radio dramas would still be generated as above, but the more critical and expressive passages could be performed by a voice actor and then swapped to the character's voice. The whole drama could also be dubbed this way. In short, one voice actor was enough; one person could do the work of an entire cast. This kind of radio drama was like an ordinary TV drama with occasional bursts of fine acting.

In the high-end market, fine-tuning a radio drama in the software could achieve very sophisticated expression. This kind of radio drama was like a big film full of fine acting, a "masterpiece" in the usual sense.

A radio drama still needed three kinds of talent, screenwriter, director, and voice actor, but the team had shrunk sharply, to three people at most.

A screenwriter alone could complete a low-end series; add a voice actor, and they could complete a mid-range or high-end one, and it would be even better if the screenwriter was also the voice actor. As for the director, he was dispensable.

Two people could do it, and so could one. Ultra-efficient production of radio drama content. This was, this was...

"This is the UGC model of paid audio..."
