Tuesday, 14 June 2016

Trying to append values to a numpy array of vector features

I have a set of feature vectors for sentences I have obtained using:

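import json
import sys

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Load the sentence -> region -> value mapping from the JSON file given on the command line.
with open(sys.argv[1]) as trainingSentences:
    sentence2region2value = json.load(trainingSentences)

train_wordlist = []

# sentence_to_words is a helper defined elsewhere (not shown here).
for sentence, locations in sentence2region2value.items():
    train_wordlist.append(" ".join(sentence_to_words(sentence, True)))

# Bag-of-words counts, capped at 5000 features.
vectorizer = CountVectorizer(analyzer="word",
                             tokenizer=None,
                             preprocessor=None,
                             stop_words=None,
                             max_features=5000)

train_data_features = vectorizer.fit_transform(train_wordlist)

# Convert the sparse matrix to a dense NumPy array.
train_data_features = train_data_features.toarray()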

I also want to add the label for each of these 492 feature vectors so that I can train a logistic regression. This "prediction" label is stored in the sentence2region2value dictionary, which is structured roughly like this:

{sentence: Y
   {parsedsentence: Z
        {prediction: X,
             location-values:{"Qatar": [32,221,31]},{"Dubai": [12,123,421]},.....}

Currently I am trying to use this:

for prediction in sentence2region2value["sentence"]["parsedsentence"].iteritems():
      for i in train_data_features:
            train_data_features[i] = np.append(train_data_features[i],np.array(prediction))

But it isn't working. Any ideas?
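
One possible direction, as a minimal sketch: the loop above iterates over the rows of train_data_features rather than over integer indices, and a NumPy array row cannot be made longer in place, so np.append has nowhere to store its result. Building the label column once, in the same order as the sentences, and stacking it in a single call avoids both problems. The toy dictionary and feature matrix below are hypothetical stand-ins, and the assumption that "prediction" sits directly under each sentence key may not match the real nesting.

```python
import numpy as np

# Hypothetical stand-in for sentence2region2value; the real nesting may differ.
sentence2region2value = {
    "sentence one": {"prediction": 1, "location-values": {"Qatar": [32, 221, 31]}},
    "sentence two": {"prediction": 0, "location-values": {"Dubai": [12, 123, 421]}},
    "sentence three": {"prediction": 1, "location-values": {}},
}

# Stand-in for vectorizer.fit_transform(train_wordlist).toarray():
# one row per sentence, one column per vocabulary word.
train_data_features = np.array([[1, 0, 2],
                                [0, 1, 0],
                                [3, 1, 1]])

# Collect the labels in the same iteration order used to build train_wordlist,
# so that row i of the features matches label i.
labels = np.array([entry["prediction"] for entry in sentence2region2value.values()])

# Append the labels as one extra column in a single operation.
features_with_labels = np.column_stack((train_data_features, labels))
```

For scikit-learn's LogisticRegression the labels are normally kept separate and passed as the second argument, as in fit(train_data_features, labels), rather than stored as an extra feature column.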
