Lexicon learning-based heliumspeech unscrambling method in saturation diving
12094482 · 2024-09-17
Inventors
- Shibing ZHANG (Nantong, CN)
- Jianrong WU (Nantong, CN)
- Lili GUO (Nantong, CN)
- Ming LI (Nantong, CN)
- Zhihua BAO (Nantong, CN)
Cpc classification
International classification
G10L21/00
PHYSICS
G10L15/06
PHYSICS
Abstract
The present application relates to a lexicon learning-based heliumspeech unscrambling method in saturation diving. In a system including divers, a correction network, and an unscrambling network, a common working language lexicon for saturation diving operation is established and is read by the divers respectively in different environments, to generate supervision signals and vector signals of the correction network, and the correction network learns heliumspeeches of the different divers at different diving depths to obtain a correction network parameter, and corrects a heliumspeech of a diver to obtain a corrected speech; and the unscrambling network learns the corrected speech and completes unscrambling of the heliumspeech.
Claims
1. A lexicon learning-based heliumspeech unscrambling method in saturation diving, applicable to a system comprising at least one diver i, one heliumspeech correction network, and one heliumspeech unscrambling network, wherein a heliumspeech signal of the diver i is S, and the heliumspeech unscrambling method comprises the following steps:
step 1. lexicon signal construction: constructing a common working language lexicon K of the diver i for saturation diving operation according to saturation diving specifications;
step 2. supervision signal generation: reading, by the diver i, words in the lexicon K in a normal atmospheric environment to obtain supervision signals X.sub.i, to generate a supervision signal set X={X.sub.i} of the heliumspeech correction network for machine learning, wherein i=1, 2, . . . , I, and I is a number of divers;
step 3. vector signal generation: reading, by the diver i, the words in the lexicon K respectively in environments corresponding to saturation diving depths h.sub.1, h.sub.2, h.sub.3, . . . , h.sub.L, to obtain vector signals Y.sub.i,l, wherein l=1, 2, . . . , L, and L is a number of heliumspeech test points, to generate a vector signal set Y={Y.sub.i,l} of the heliumspeech correction network for machine learning;
step 4. learning of the heliumspeech correction network: performing, by the heliumspeech correction network, supervised learning by using the vector signals Y.sub.i,l as input signals and the supervision signals X.sub.i as expected output signals to form a correction network parameter set C={C.sub.i,l} corresponding to the vector signals Y.sub.i,l;
step 5. correction network parameter selection: fitting the heliumspeech signal S of the diver i during saturation diving operation with all the vector signals Y.sub.i,l in the vector signal set Y, and selecting a parameter C.sub.n,l corresponding to a vector signal Y.sub.n,l having a highest fitness as a correction network parameter;
step 6. heliumspeech correction: correcting the heliumspeech signal S by using the heliumspeech signal S as an input signal of the heliumspeech correction network, to generate a corrected speech signal T;
step 7. learning of the unscrambling network: comparing speeches in the corrected speech signal T with speeches in the supervision signals in the supervision signal set X of the heliumspeech correction network for machine learning word by word, to calculate fitnesses therebetween; selecting, from the supervision signal set X, speeches corresponding to words having the highest fitnesses; matching the selected speeches with speeches corresponding to the words in the corrected speech signal T into groups; sorting, in descending order by fitness, the matched speeches of the groups; selecting a top p % of the groups in order of the fitness; taking speeches in the corrected speech signal T in the selected top p % of the groups as a vector signal U of the unscrambling network for machine learning and taking speeches corresponding to words in the supervision signal set X in the selected top p % of the groups as a supervision signal V of the heliumspeech unscrambling network for machine learning, and performing, by the unscrambling network, supervised learning; and
step 8. heliumspeech unscrambling: unscrambling the heliumspeech signal S by using the corrected speech signal T as an input signal of the heliumspeech unscrambling network.
2. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein in step 5 and step 7, an evaluation indicator of the fitness is a Euclidean distance or a variance, a smaller Euclidean distance indicates a higher fitness, and a smaller variance indicates a higher fitness.
3. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein the common working language lexicon K of the diver i for saturation diving operation is set by using a heliumspeech unscrambler according to saturation diving specifications of a unit.
4. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein heliumspeech test point depths h.sub.1, h.sub.2, h.sub.3, . . . , h.sub.L evenly cover a preset depth of salvaging and diving operation.
5. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 4, wherein a number of test points is determined according to the preset depth of salvaging and diving operation and a spacing between the test points.
6. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein in step 2, when a supervision signal of the correction network is a word label, the lexicon K is directly used as the supervision signal X; and correspondingly, the corrected speech signal T generated in step 6 is also a word, and an unscrambled heliumspeech signal generated in step 8 is a word.
7. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein if a number of words in the lexicon K is between 100 and 300, a value of p is selected from between 85 and 98.
8. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein the learning methods used in step 4 and step 7 are a K-nearest neighbor algorithm and a decision tree algorithm, or a self-training algorithm and a semi-supervised support vector machine algorithm.
9. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein a distortion recognition is performed on the speech of the diver i, and if a distortion is relatively low, the corrected speech signal T is directly output as an unscrambled heliumspeech signal.
10. The lexicon learning-based heliumspeech unscrambling method in saturation diving according to claim 1, wherein step 1 to step 4 are performed by the diver i in a submersible, and step 5 to step 8 are performed by the diver i during deep-sea diving operation.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
DETAILED DESCRIPTION OF THE EMBODIMENTS
(2) The present application is further described below with reference to the accompanying drawings and specific embodiments.
(3) The system includes a diver, a heliumspeech correction network, and a heliumspeech unscrambling network. First, a common working language lexicon of the divers is established according to saturation diving specifications. The divers read the working language lexicon aloud in a normal atmospheric environment and in environments corresponding to saturation diving operation, to generate supervision signals and vector signals of the correction network for machine learning, and the correction network learns heliumspeeches of the different divers at different diving depths by using a supervised learning algorithm, to obtain a correction network parameter set. Second, during diving operation, the heliumspeech signal of a diver is fitted with the vector signals of the correction network, the network parameter corresponding to the vector signal having the highest fitness is selected as the correction network parameter, and the heliumspeech of the diver is corrected to obtain a corrected speech signal. Then, the corrected speech signal is fitted with the common working language lexicon and filtered in descending order by fitness, to generate supervision signals and vector signals of the unscrambling network for machine learning, and the unscrambling network further learns the corrected speech signal by using the supervised learning algorithm. Finally, the unscrambling network unscrambles the corrected speech signal, completing the unscrambling of the heliumspeech.
(4) First Stage: Correction Network Learning
(5) Step 1. Lexicon signal construction: Construct a common working language lexicon K of the diver for saturation diving operation according to saturation diving specifications.
(6) In this embodiment, according to saturation diving specifications of the XX Salvage Bureau, a common working language lexicon K including 150 words such as diving, deck, temperature, and pressure is constructed.
Step 2. Supervision signal generation: The diver i reads words in the lexicon K aloud in a normal atmospheric environment to obtain supervision signals X.sub.i, so as to generate a supervision signal set X={X.sub.i} of the correction network for machine learning, where i=1, 2, . . . , I, and I is a number of the divers.
(7) In this embodiment, two divers respectively read words in the lexicon K aloud, to generate a supervision signal set, consisting of X.sub.1 and X.sub.2 (speech signals), of the correction network for machine learning.
Step 3. Vector signal generation: The diver i reads the words in the lexicon K aloud respectively in environments corresponding to saturation diving depths h.sub.1, h.sub.2, h.sub.3, . . . , h.sub.L to obtain vector signals Y.sub.i, l, where l=1, 2, . . . , L, and L is a number of the heliumspeech test points, so as to generate a vector signal set Y={Y.sub.i, l} of the correction network for machine learning.
(8) In this embodiment, a saturation diving depth ranges from 200 m to 250 m, a spacing between test points is 10 m, and the two divers respectively read the words in the lexicon K in the submersible in environments corresponding to saturation diving depths of 200 m, 210 m, 220 m, 230 m, 240 m, and 250 m, to generate vector signals (speech signals) Y.sub.1, 1, Y.sub.1, 2, Y.sub.1, 3, Y.sub.1, 4, Y.sub.1, 5, Y.sub.1, 6, Y.sub.2, 1, Y.sub.2, 2, Y.sub.2, 3, Y.sub.2, 4, Y.sub.2, 5, and Y.sub.2, 6 of the correction network for machine learning.
Step 4. Learning of the correction network: The correction network performs supervised learning by using the vector signals Y.sub.i, l as input signals and the supervision signals X.sub.i as expected output signals to form a correction network parameter set C={C.sub.i, l} corresponding to the vector signals Y.sub.i, l.
(9) In this embodiment, the correction network performs supervised learning by using a K-nearest neighbor algorithm. After the supervised learning, the correction network generates corresponding correction network parameters C.sub.1, 1, C.sub.1, 2, C.sub.1, 3, C.sub.1, 4, C.sub.1, 5, C.sub.1, 6, C.sub.2, 1, C.sub.2, 2, C.sub.2, 3, C.sub.2, 4, C.sub.2, 5, and C.sub.2, 6 in correspondence to the different vector signals Y.sub.1, 1, Y.sub.1, 2, Y.sub.1, 3, Y.sub.1, 4, Y.sub.1, 5, Y.sub.1, 6, Y.sub.2, 1, Y.sub.2, 2, Y.sub.2, 3, Y.sub.2, 4, Y.sub.2, 5, and Y.sub.2, 6 and the supervision signals X.sub.1 and X.sub.2. When input vector signals of the correction network are Y.sub.1, 1, Y.sub.1, 2, Y.sub.1, 3, Y.sub.1, 4, Y.sub.1, 5, and Y.sub.1, 6, their supervision signal is X.sub.1. When input vector signals of the correction network are Y.sub.2, 1, Y.sub.2, 2, Y.sub.2, 3, Y.sub.2, 4, Y.sub.2, 5, and Y.sub.2, 6, their supervision signal is X.sub.2.
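One way to picture K-nearest neighbor learning for a single pair (Y.sub.i, l, X.sub.i) is the toy regressor below. It is a minimal sketch under stated assumptions, not the application's implementation: it assumes each spoken word has already been reduced to a fixed-length feature vector (the application does not specify a feature representation), and the class name KnnCorrector and the numeric values are invented for illustration.

```python
import math


class KnnCorrector:
    """Toy k-NN regressor mapping a heliumspeech feature vector to a normal-speech one."""

    def __init__(self, k=1):
        self.k = k
        self.inputs = []
        self.targets = []

    def fit(self, helium_vectors, normal_vectors):
        # For k-NN, "learning" is storing the word-by-word (Y, X) pairs;
        # the stored pairs play the role of one parameter set C_{i,l}.
        self.inputs = list(helium_vectors)
        self.targets = list(normal_vectors)
        return self

    def correct(self, helium_vector):
        # Rank stored heliumspeech vectors by Euclidean distance to the query.
        ranked = sorted(
            range(len(self.inputs)),
            key=lambda j: math.dist(helium_vector, self.inputs[j]),
        )
        nearest = ranked[: self.k]
        dims = len(self.targets[0])
        # Average the normal-speech targets of the k nearest neighbors.
        return [sum(self.targets[j][d] for j in nearest) / len(nearest) for d in range(dims)]


# Hypothetical two-word lexicon with 2-dimensional features.
y_1_3 = [[2.0, 4.0], [6.0, 1.0]]   # heliumspeech vectors (inputs)
x_1 = [[1.0, 2.0], [3.0, 0.5]]     # normal-atmosphere vectors (expected outputs)
model = KnnCorrector(k=1).fit(y_1_3, x_1)
print(model.correct([2.1, 3.9]))   # [1.0, 2.0]
```

Training one such model per (diver, depth) combination would give the twelve parameter sets listed above.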
(10) Second Stage: Heliumspeech Unscrambling
(11) Step 5. Correction network parameter selection: Fit the working speech S (heliumspeech) of the diver during normal saturation diving operation with all the vector signals Y.sub.i, l in the vector signal set Y, and select a parameter C.sub.n, l corresponding to a vector signal Y.sub.n, l having the highest fitness as a network parameter of the correction network.
(12) In this embodiment, when the diver 1 is working, a working speech signal of the diver 1, that is, the heliumspeech S, is fitted with all the vector signals Y.sub.1, 1, Y.sub.1, 2, Y.sub.1, 3, Y.sub.1, 4, Y.sub.1, 5, Y.sub.1, 6, Y.sub.2, 1, Y.sub.2, 2, Y.sub.2, 3, Y.sub.2, 4, Y.sub.2, 5, and Y.sub.2, 6 respectively, and the network parameter C.sub.1, 3 corresponding to the vector signal Y.sub.1, 3 having the highest fitness is selected as the network parameter of the correction network. During the fitting, the Euclidean distance is used as an evaluation indicator.
Step 6. Heliumspeech correction: Correct the heliumspeech signal S by using the heliumspeech signal S as an input signal of the correction network (in this case, the network parameter of the correction network is C.sub.n, l), to generate a corrected speech signal T.
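The parameter selection of Step 5 can be sketched as a nearest-match search in which a smaller Euclidean distance means a higher fitness. The sketch assumes that S and each vector signal are comparable fixed-length feature vectors; the function name select_parameter and all numeric values are hypothetical, chosen so that the index (1, 3) wins as in this embodiment.

```python
import math


def select_parameter(helium_speech, vector_signals):
    """Return the (diver, depth index) key whose vector signal is closest to S."""
    return min(
        vector_signals,
        key=lambda key: math.dist(helium_speech, vector_signals[key]),
    )


# Hypothetical feature vectors keyed by (i, l); only three of the twelve are shown.
vector_signals = {
    (1, 2): [0.9, 1.8, 2.7],
    (1, 3): [1.0, 2.0, 3.0],
    (2, 3): [1.6, 2.4, 3.9],
}
s = [1.02, 1.97, 3.01]
selected = select_parameter(s, vector_signals)
print(selected)  # (1, 3)
```

The returned key identifies which stored parameter C.sub.n, l the correction network would load before correcting S in Step 6.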
(13) In this embodiment, the correction network parameter adopted by the correction network for correcting the heliumspeech signal S is C.sub.1, 3, and the generated corrected speech signal is T.
Step 7. Learning of the unscrambling network: Compare speeches in the corrected speech signal T with speeches in the supervision signals in the supervision signal set X of the correction network for machine learning word by word, to calculate fitnesses therebetween; select, from the supervision signal set X, speeches corresponding to words having the highest fitnesses; match the selected speeches with speeches corresponding to the words in the corrected speech signal T into groups; sort, in descending order by fitness, the matched speeches of the groups; select the top p % of the groups in order of fitness; take speeches in the corrected speech signal T in the selected top p % of the groups as a vector signal U of the unscrambling network for machine learning; and take speeches corresponding to words in the supervision signal set X in the selected top p % of the groups as a supervision signal V of the unscrambling network for machine learning. The unscrambling network performs supervised learning.
(14) In this embodiment, the corrected speech signal T is compared with the supervision signals in the supervision signal set X of the correction network for machine learning word by word by using the Euclidean distance. Speeches corresponding to words having the highest fitnesses with speeches corresponding to the words in the corrected speech signal T are selected from the supervision signal set X, the speeches are matched into groups, and the matched groups are sorted in descending order by fitness. Speech signals in the corrected speech signal T in the top 90% of matched groups in terms of the fitness are selected as a vector signal U of the unscrambling network for machine learning, and the speech signals in the supervision signal set X corresponding thereto are used as supervision signals V of the unscrambling network for machine learning. The unscrambling network performs supervised learning by using the K-nearest neighbor algorithm.
Step 8. Heliumspeech unscrambling: Unscramble the heliumspeech S by using the corrected speech signal T as an input signal of the unscrambling network.
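The matching and top p % filtering of Step 7 can be sketched as below. Several details are assumptions rather than statements of the application: words are represented as fixed-length feature vectors, fitness is the negated Euclidean distance, and the number of retained groups is rounded down. The function name build_unscrambling_set and the sample values are hypothetical.

```python
import math


def build_unscrambling_set(corrected, supervision, p):
    """Return (U, V): the top p % best-matched corrected vectors and their matched targets."""
    groups = []
    for t in corrected:
        # Match each corrected word with the supervision word of highest fitness.
        best = min(supervision, key=lambda x: math.dist(t, x))
        groups.append((math.dist(t, best), t, best))
    # Smallest distance means highest fitness, so ascending distance
    # is descending fitness.
    groups.sort(key=lambda g: g[0])
    keep = int(len(groups) * p / 100)  # assumption: round down
    kept = groups[:keep]
    u = [g[1] for g in kept]
    v = [g[2] for g in kept]
    return u, v


supervision = [[0.0, 0.0], [10.0, 10.0]]
corrected = [[0.1, 0.0], [9.0, 10.0], [4.0, 4.0], [10.0, 10.2]]
u, v = build_unscrambling_set(corrected, supervision, p=75)
print(u)  # [[0.1, 0.0], [10.0, 10.2], [9.0, 10.0]]
print(v)  # [[0.0, 0.0], [10.0, 10.0], [10.0, 10.0]]
```

The poorly matched word [4.0, 4.0] is dropped, so the unscrambling network is trained only on pairs where the correction was most reliable. For the 150-word lexicon with p = 90, the same rule would keep 135 groups.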
(15) In addition to the foregoing embodiments, the present application may further include other implementations. Any technical solution formed through equivalent replacement or equivalent transformation falls within the protection scope claimed in the present application.