hasmeister.blogg.se - Encoding in a sentence

I have sentences stored in a text file, which looks like this:

MDCT OF THE CHEST History: A 58-year-old male, known case lung s/p LUL segmentectomy. Technique: Plain and enhanced-MPR CT chest is performed using 2 mm interval. Previous study: (other hospital) Findings: Lung parenchyma: The study reveals evidence of apicoposterior segmentectomy of LUL showing soft tissue thickening adjacent surgical bed at LUL, possibly post operation.

My ultimate goal is to apply LDA to classify each sentence into one topic. Before that, I want to one-hot encode the text. The problem I am facing is that I want to one-hot encode each sentence into a numpy array so that I can feed it into LDA. If I want to one-hot encode the full text, I can easily do it in two lines, the key one being:

    hot_encode = pd.Series(sent_text).str.get_dummies(' ')

However, my goal is a one-hot encoding per sentence in a numpy array. This is what I have so far:

    from numpy import array
    from sklearn.preprocessing import LabelEncoder
    from sklearn.preprocessing import OneHotEncoder
    from nltk.tokenize import TweetTokenizer, sent_tokenize

    with open('radiologicalreport.txt', 'r') as myfile:
        text = myfile.read()  # assumed body: read the whole report as one string
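One possible way to get a per-sentence encoding as a numpy array is sketched below. It is a minimal illustration, not a definitive answer: the helper names `split_sentences`, `tokenize` and `encode_sentences` are hypothetical, the regex sentence splitter is a crude stand-in for `sent_tokenize`, and the result is a binary sentence-by-vocabulary matrix (one row per sentence, a 1 wherever a word occurs), which is an assumption about what "one hot encoding per sentence" should mean here.

```python
import re

import numpy as np


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters, digits, '/' and '-'."""
    return re.findall(r"[a-z0-9][a-z0-9/-]*", sentence.lower())


def encode_sentences(sentences):
    """Return (vocab, matrix); matrix[i, j] is 1 if vocab[j] occurs in sentence i."""
    tokenized = [tokenize(s) for s in sentences]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: j for j, tok in enumerate(vocab)}
    matrix = np.zeros((len(sentences), len(vocab)), dtype=np.int64)
    for i, toks in enumerate(tokenized):
        for tok in toks:
            matrix[i, index[tok]] = 1
    return vocab, matrix


# First two sentences of the sample report, inlined so the sketch runs on its own.
report = (
    "MDCT OF THE CHEST History: A 58-year-old male, known case lung "
    "s/p LUL segmentectomy. Technique: Plain and enhanced-MPR CT chest "
    "is performed using 2 mm interval."
)
sentences = split_sentences(report)
vocab, encoded = encode_sentences(sentences)
print(encoded.shape)  # (number of sentences, vocabulary size)
```

Each row of `encoded` describes one sentence, so the array has the document-by-term layout that topic models such as scikit-learn's `LatentDirichletAllocation` expect as input. LDA is usually fitted on word counts rather than binary indicators, so replacing `= 1` with `+= 1` in the inner loop may be the better choice for that step.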