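1. Make a backup of the database
1st, open a terminal in Jupyter and start the MongoDB server, then the shell (run mongo in a second terminal if mongod stays in the foreground)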
$ mongod
$ mongo
2nd, select the database you will use
$ show dbs
$ use awms
3rd, list the collections (tables) and pick the one you will use
$ show tables
4th, create a backup collection and copy the data from the original collection into it (copyTo() only exists in older MongoDB versions)
$ db.createCollection('loginfo_backup')
$ db.loginfo.copyTo('loginfo_backup')
5th, work on the backup collection when cleaning the data
import pymongo
client = pymongo.MongoClient('localhost', 27017)
awms = client['awms']
loginfo = awms['loginfo_backup']
2. Removing the ‘\xa0’ string from [‘caseid’]
1st, use this code to find out which records contain ‘\xa0’
for i in loginfo.find():
    if '\xa0' in i['caseid']:
        print(i['caseid'])
2nd, to remove this string from the data, use j = "".join(j.split())
(or, equivalently, j = ''.join(j.split())); the check below prints 'no' when the ‘\xa0’ is gone
refer to: https://www.cnblogs.com/yqpy/p/8203783.html
for i in loginfo.find():
    if '\xa0' in i['caseid']:
        j = i['caseid']
        j = "".join(j.split())
        if '\xa0' in j:
            print('yes')
        else:
            print('no')
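Why this works: called without arguments, str.split() splits on any Unicode whitespace, and '\xa0' (the non-breaking space) counts as whitespace. Here is a minimal sketch on a made-up caseid; the helper name strip_all_whitespace and the sample value are only for illustration. Note that ordinary spaces are removed as well.

```python
def strip_all_whitespace(text):
    # split() with no arguments splits on every whitespace character,
    # including the non-breaking space; joining with '' drops them all
    return ''.join(text.split())

sample = 'AB\xa0123 45'  # hypothetical caseid, not taken from the real data
cleaned = strip_all_whitespace(sample)
print(repr(cleaned))
```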
3rd, write the cleaned data back to the db
for i in loginfo.find():
    if '\xa0' in i['caseid']:
        j = i['caseid']
        j = ''.join(j.split())
        # update_one replaces the deprecated update() in current pymongo
        loginfo.update_one({'_id': i['_id']}, {'$set': {'caseid': j}})
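Before writing to the db, it can help to preview what would change. The sketch below builds the same filter and $set documents as the loop above, but from a plain list of dicts so it runs without a database. The helper name plan_caseid_fixes and the sample documents are assumptions for illustration.

```python
def plan_caseid_fixes(docs):
    # collect one (filter, update) pair per document whose caseid contains '\xa0'
    fixes = []
    for doc in docs:
        if '\xa0' in doc['caseid']:
            cleaned = ''.join(doc['caseid'].split())
            fixes.append(({'_id': doc['_id']}, {'$set': {'caseid': cleaned}}))
    return fixes

# made-up documents standing in for loginfo.find()
sample_docs = [
    {'_id': 1, 'caseid': 'A\xa0001'},
    {'_id': 2, 'caseid': 'A002'},
]
planned = plan_caseid_fixes(sample_docs)
for flt, upd in planned:
    print(flt, upd)
    # against the real collection this would be: loginfo.update_one(flt, upd)
```

Only the first sample document is listed, because the second has nothing to clean.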
3. Visualize the data with Highcharts
import the charts package
import charts
Press Shift + Enter to run the cell. The import worked if a message like ‘Server running in the folder /Users/osx at 127.0.0.1:65060’ comes out.
First, let’s look at a sample
series = [{
'name': 'OS X',
'data': [11],
'type': 'column'
},{
'name': 'windows',
'data': [12],
'type': 'column'
},{
'name': 'other',
'data': [2],
'type': 'column'
}]
charts.plot(series, show='inline', options=dict(title=dict(text='charts are nice!')))
Now, let’s make a chart that shows the volume of each city
1st, get the unique city names
city_list = []
for i in loginfo.find():
    city_list.append(i['city'])
city_index = list(set(city_list))
print(city_index)
# city_index holds the unique city names; a set removes duplicates automatically
![3.png](http://user-image.logdown.io/user/42937/blog/39533/post/4734411/i4cAurYwR3GMDsjSbT4U_3.png)
2nd, count the volume of each unique city
post_times = []
for index in city_index:
    post_times.append(city_list.count(index))
print(post_times)
# count how many times each unique city appears in city_list (which still has duplicates)
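The set plus list.count() approach scans city_list once per city. As an alternative sketch (not what the steps above use), collections.Counter from the standard library gets the unique names and their counts in a single pass. The city names below are made up for illustration.

```python
from collections import Counter

# made-up sample standing in for the values read from loginfo
city_list = ['Beijing', 'Shanghai', 'Beijing', 'Shenzhen', 'Beijing']

city_counts = Counter(city_list)   # maps each city to its number of occurrences
city_index = list(city_counts)     # unique cities, in first-seen order
post_times = [city_counts[city] for city in city_index]

print(city_index)
print(post_times)
```

Unlike a set, the Counter keeps the cities in the order they first appear, so the chart columns come out in a stable order.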
3rd, write a generator that produces the data for the chart
def data_gen(types):
    # yield one series entry per city, pairing each name with its count
    for city, times in zip(city_index, post_times):
        data = {
            'name': city,
            'data': [times],
            'type': types
        }
        yield data
4th, make the chart
series = [data for data in data_gen('column')]
charts.plot(series, show='inline', options=dict(title=dict(text='case city')))
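With many cities the columns are easier to read when ordered by volume. The sketch below builds the same kind of series list from a dict of counts and sorts it from largest to smallest. The function name build_series and the sample counts are assumptions for illustration, and the result can be passed to charts.plot in the same way as above.

```python
def build_series(counts, chart_type='column'):
    # sort (city, count) pairs by count, largest first
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # one series entry per city, in the shape used for charts.plot above
    return [{'name': city, 'data': [n], 'type': chart_type} for city, n in ordered]

# made-up counts standing in for the real city volumes
sample_counts = {'Shanghai': 1, 'Beijing': 3, 'Shenzhen': 2}
sorted_series = build_series(sample_counts)
for entry in sorted_series:
    print(entry)
```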