In MongoDB it is suggested to turn atime off. atime is set by Linux on each file accessed by applications, and it is reported repeatedly that turning it off will improve disk performance on that partition.

To turn off atime you need to set noatime on the partition where you place your MongoDB database files. Open /etc/fstab and look for your desired partition (mine is /var):

/dev/mapper/mongo--vg-var /var xfs defaults 0 2

Add noatime after defaults:

/dev/mapper/mongo--vg-var /var xfs defaults,noatime 0 2

Yours may be a little bit different. Now reboot your server using reboot (or remount the partition with mount -o remount /var). Then you can check with mount -l whether noatime is set or not:

/dev/mapper/mongo--vg-var on /var type xfs (rw,noatime,attr2,inode64,logbsize=256k,sunit=512,swidth=1024,noquota)

In the next post we will test this using the touch command in Linux.
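Until then, here is a minimal Python sketch for a quick check of your own; it assumes you can write a scratch file on the noatime partition (the path is illustrative):

# atime_check.py -- read a file and see whether its access time changes
import os
import time

path = "/var/atime_test.txt"  # hypothetical test file on the noatime partition
with open(path, "w") as f:
    f.write("hello")

before = os.stat(path).st_atime
time.sleep(1)
with open(path) as f:
    f.read()  # on a noatime mount this read should NOT bump atime
after = os.stat(path).st_atime

print("atime changed" if after != before else "atime unchanged (noatime is working)")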
#mongodb #mongo #noatime #atime #xfs #linux #fstab #mount
Have you seen that when you query MongoDB in the shell it prints only the first batch of records (20 by default) and prompts you to type "it" in order to see more? Well, in the MongoDB shell you can issue the below command to say how many records to return:

DBQuery.shellBatchSize = 3000

If you enter the above command in the MongoDB shell and use find to query a collection that has more than 3000 documents, 3000 documents will be displayed at once.
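Note that shellBatchSize only affects how many documents the shell displays per iteration. If you are on pymongo, the closest knob is the cursor's batch_size, which controls how many documents come back per network round trip; a minimal sketch (database and collection names are illustrative):

import pymongo

client = pymongo.MongoClient()  # assumes a local mongod
cursor = client.mydb.mycol.find().batch_size(3000)  # up to 3000 docs per batch
for doc in cursor:
    print(doc)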
#mongodb #mongo #shellBatchSize #limit
Now to make your MongoDB client connection secure, just pass ssl=True:

# test_mongodb_ssl.py
import pymongo

client = pymongo.MongoClient('example.com', ssl=True)

When you run this script, check your MongoDB logs (usually in /var/log/mongodb/mongod.log). The thing you should take into account is that when you pass the ssl=True parameter to MongoClient, you should see only the below log (IP addresses will vary):

I NETWORK [listener] connection accepted from 172.15.141.162:50761 #49 (39 connections now open)
I NETWORK [conn49] end connection 172.15.141.162:50761 (38 connections now open)

Now remove ssl=True from MongoClient or pass ssl=False. If you run your test script again, you will see something like the below in mongod.log:

I NETWORK [listener] connection accepted from 172.15.141.162:50762 #50 (39 connections now open)
I NETWORK [conn50] SSL mode is set to 'preferred' and connection 50 to 172.15.141.162:50762 is not using SSL.

It says that the SSL mode in the mongo config is set to preferSSL and your new connection to mongo is not using it.

YOU NEED TO BE CAUTIOUS that we have created our SSL certificate ourselves and it is vulnerable to a man-in-the-middle attack. For production usage, purchase your SSL/TLS certificate.
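If you are using a self-signed CA like ours, you can also point the client at the CA file so the server certificate is actually verified against it; a minimal pymongo 3.x sketch, assuming the CA bundle lives at /etc/ssl/ca.pem (adjust the path to yours):

import pymongo

client = pymongo.MongoClient(
    'example.com',
    ssl=True,
    ssl_ca_certs='/etc/ssl/ca.pem',  # verify the server cert against our own CA
)
print(client.admin.command('ping'))  # fails if the certificate cannot be verified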
#mongodb #mongo #ssl #pymongo
If you have followed our MongoDB SSL configuration posts, you should by now know that we can generate an SSL certificate using Let's Encrypt. I have used dehydrated, which works well with Cloudflare.

To make the procedure automatic, I have created a sample shell script that, after automatic renewal, will also renew the PEM files for MongoDB. Run it after each renewal (for example from a cron job or a dehydrated hook):

#! /bin/bash
echo 'Binding new mongo private key PEM file and Cert PEM file...'
cat /etc/dehydrated/certs/mongo.example.com/privkey.pem /etc/dehydrated/certs/mongo.example.com/cert.pem > /etc/ssl/mongo.pem
echo 'Saved the new file in /etc/ssl/mongo.pem'
sudo touch /etc/ssl/ca.pem
sudo chmod 777 /etc/ssl/ca.pem
echo 'truncate ca.pem file and generate a new one in /etc/ssl/ca.pem...'
sudo truncate -s 0 /etc/ssl/ca.pem
echo 'generate a ca.pem file using openssl from input -> /etc/ssl/ca.crt'
sudo openssl x509 -in /etc/ssl/ca.crt -out /etc/ssl/ca.pem -outform PEM
echo 'ca.pem generated successfully in /etc/ssl'
echo 'append the chain.pem content to the newly created ca.pem in /etc/ssl/ca.pem'
sudo cat /etc/dehydrated/certs/mongo.example.com/chain.pem >> /etc/ssl/ca.pem
echo 'done!'
#mongodb #mongo #ssl #pem #openssl #lets_encrypt
Today I fixed a really C**Py bug which had been bugging me for days and nights!

I use a scheduler to get data from MongoDB; one of the servers is outside of Iran and another is inside Iran. When I want to get data, sometimes querying the db takes forever and freezes the data-gathering procedure. I had to restart (like Windows) to reset the connection. I know, it was stupid! :|

I found the below parameter that you can set on your pymongo.MongoClient:

socketTimeoutMS=10000

socketTimeoutMS: (integer or None) Controls how long (in milliseconds) the driver will wait for a response after sending an ordinary (non-monitoring) database operation before concluding that a network error has occurred. Defaults to None (no timeout).

When you don't set it, there is no timeout at all! So I set it to 20000 ms (20 sec) in order to solve this nasty problem.
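Here is a minimal sketch of what the fix looks like in code; the host and query are illustrative, and pymongo raises NetworkTimeout (a subclass of AutoReconnect) when the socket timeout fires:

import pymongo
from pymongo.errors import NetworkTimeout

# 20-second socket timeout: a hung query now fails fast instead of freezing the scheduler
client = pymongo.MongoClient('example.com', socketTimeoutMS=20000)

try:
    docs = list(client.mydb.mycol.find({'status': 'pending'}))  # hypothetical query
except NetworkTimeout:
    print('query timed out; will retry on the next scheduler tick')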
#mongodb #mongo #socketTimeoutMS #timeout #socket_timeout
In Grafana, if you are connected to MySQL, you need to provide 3 values in your select query. One is the time, which must be called time_sec; another is the countable value, which must be called value; and the last is the label displayed on your graph, which must be called metric:

SELECT
    UNIX_TIMESTAMP(your_date_field) as time_sec,
    count(*) as value,
    'your_label' as metric
FROM table
WHERE status='success'
GROUP BY your_date_field
ORDER BY your_date_field ASC

To read more about Grafana head over here:
- http://docs.grafana.org/features/datasources/mysql/#using-mysql-in-grafana
#mongodb #mongo #mysql #grafana #dashboard #chart
Months ago we talked about how to get MongoDB data changes. The problem with that article was that if for any reason your script was stopped, you would lose the data from the downtime period.

Now we have a new solution that lets you read from the point in time that you read last. MongoDB uses the BSON Timestamp internally, for example in the replication oplog. We can use that same Timestamp and store it somewhere in order to resume reading from the exact point we reached last time.

In Python you can import it like below:

from bson.timestamp import Timestamp

Now to read data from that point, load the timestamp from wherever you saved it and query the oplog from there:

ts = YOUR_TIMESTAMP_HERE
cursor = oplog.find({'ts': {'$gt': ts}},
                    cursor_type=pymongo.CursorType.TAILABLE_AWAIT,
                    oplog_replay=True)

After traversing the cursor and catching MongoDB changes, store the new timestamp that resides in the ts field of each document you fetch from the oplog. Now use a while True loop and read data while the cursor is alive, as in the sketch below. The point of this post is that you can store ts somewhere and resume from the point where you stored it.
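A minimal tailing sketch under the assumptions above; load_saved_ts, save_ts and handle_change are hypothetical hooks around your own storage and processing:

import time

import pymongo

client = pymongo.MongoClient()  # assumes a replica-set member, so the oplog exists
oplog = client.local.oplog.rs

ts = load_saved_ts()  # hypothetical: the last Timestamp you persisted
while True:
    cursor = oplog.find({'ts': {'$gt': ts}},
                        cursor_type=pymongo.CursorType.TAILABLE_AWAIT,
                        oplog_replay=True)
    while cursor.alive:
        for doc in cursor:
            handle_change(doc)  # hypothetical: process the change event
            ts = doc['ts']
            save_ts(ts)  # hypothetical: persist ts so a restart resumes here
        time.sleep(1)  # nothing new yet; wait before polling again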
If you remember, before this we got the last changes with the query below:

last = oplog.find().sort('$natural', pymongo.DESCENDING).limit(1).next()
ts = last['ts']

We read the last ts and started from the most recent record; that's why we were missing data.
#mongodb #mongo #replication #oplog #timestamp #cursor
In order to get a random document from a MongoDB collection, you can use the aggregation framework:
db.users.aggregate( [ { $sample: { size: 1 } } ] )

NOTE: MongoDB 3.2 introduced $sample to the aggregation pipeline.

Read more here: https://www.mongodb.com/blog/post/how-to-perform-random-queries-on-mongodb

This method is the fastest and most efficient way of getting random data from a huge collection, e.g. one with 100 M records.
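The same pipeline works unchanged from pymongo; a quick sketch (database and collection names are illustrative):

import pymongo

client = pymongo.MongoClient()
random_doc = next(client.mydb.users.aggregate([{'$sample': {'size': 1}}]))
print(random_doc)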
#mongodb #mongo #aggregate #sample #random
In pymongo you can give a name to your connections. This definitely helps to debug issues or trace activity when reading MongoDB logs. The most important part of this scenario is when you are using a microservice architecture and have tens of modules which work independently of each other and send their requests to MongoDB:

mc = pymongo.MongoClient(host, port, appname='YOUR_APP_NAME')

Now if you look at the MongoDB log you will see:

I COMMAND [conn173140] command MY_DB.users appName: "YOUR_APP_NAME" command: find { find: "deleted_users", filter: {}, sort: { acquired_date: 1 }, skip: 19973, limit: 1000, $readPreference: { mode: "secondaryPreferred" }, $db: "blahblah" } planSummary: COLLSCAN keysExamined:0 docsExamined:19973 hasSortStage:1 cursorExhausted:1 numYields:312 nreturned:0 reslen:235 locks:{ Global: { acquireCount: { r: 626 } }, Database: { acquireCount: { r: 313 } }, Collection: { acquireCount: { r: 313 } } } protocol:op_query 153ms

In the above log you can see YOUR_APP_NAME.
#mongodb #mongo #pymongo #appname
How to ignore extra fields for schema validation in Mongoengine?

Some records currently have extra fields that are not included in my model schema (by error, but I want to handle these cases). When I try to query the DB and transform the records into the schema, I get the following error:

FieldDoesNotExist
The field 'X' does not exist on the document 'Y'

To ignore this error when extra fields are present while getting data, set strict to False in your meta dictionary:

from mongoengine import Document, StringField

class User(Document):
    email = StringField(required=True, unique=True)
    password = StringField()
    meta = {'strict': False}
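A quick way to see the effect (a sketch; it assumes some stored User documents carry an unknown extra field):

# with strict=False the documents load fine; the unknown field is
# simply ignored instead of raising FieldDoesNotExist
for user in User.objects:
    print(user.email)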
#mongodb #mongo #python #mongoengine #strict #FieldDoesNotExist
In MongoDB you can remove duplicate documents based on a specific field:

db.yourCollection.aggregate([
    { "$group": {
        "_id": { "yourDuplicateKey": "$yourDuplicateKey" },
        "dups": { "$push": "$_id" },
        "count": { "$sum": 1 }
    }},
    { "$match": { "count": { "$gt": 1 } }}
]).forEach(function(doc) {
    doc.dups.shift();
    db.yourCollection.remove({ "_id": { "$in": doc.dups } });
});

It uses aggregation to group documents by the given key, pushing each _id into the dups field and keeping the group size in the count field. It then keeps only groups with a count of more than 1 using $match. At the end it loops over each group and removes all of the duplicates except the first one (shift drops the first _id from dups, so that document survives).
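If you'd rather run this from Python, here is an equivalent sketch with pymongo (collection and key names are placeholders as above; delete_many replaces the shell's remove):

import pymongo

client = pymongo.MongoClient()
coll = client.mydb.yourCollection

pipeline = [
    {'$group': {'_id': '$yourDuplicateKey',
                'dups': {'$push': '$_id'},
                'count': {'$sum': 1}}},
    {'$match': {'count': {'$gt': 1}}},
]
for group in coll.aggregate(pipeline, allowDiskUse=True):
    dup_ids = group['dups'][1:]  # keep the first document, delete the rest
    coll.delete_many({'_id': {'$in': dup_ids}})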
#mongodb #mongo #duplicates #duplication
How to configure a Delayed Replica Set Member?
Let's assume that our member is third in the array of replica members:
cfg = rs.conf()
cfg.members[2].priority = 0
cfg.members[2].hidden = true
cfg.members[2].slaveDelay = 3600
rs.reconfig(cfg)
The priority is set to 0, preventing the member from being elected as primary.
The hidden flag is set to true in order to hide the node from clients querying the database.
And finally slaveDelay is set to the number of seconds that we want it to lag behind the primary node.

The use case for this is to have a replica that is used for analytical purposes, for backups, and so on.
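The same reconfiguration can be driven from Python; a sketch for MongoDB versions of this era, assuming a pymongo connection to a replica-set member (the host is illustrative; note a manual replSetReconfig needs the config version bumped):

import pymongo

client = pymongo.MongoClient('mongo.example.com', 27017)
cfg = client.admin.command('replSetGetConfig')['config']

cfg['members'][2]['priority'] = 0
cfg['members'][2]['hidden'] = True
cfg['members'][2]['slaveDelay'] = 3600
cfg['version'] += 1  # reconfig requires a new, higher config version

client.admin.command('replSetReconfig', cfg)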
#mongodb #mongo #replica #replication #primary #delayed_replica_set #slaveDelay
How to add self-signed certificates to replica set nodes?
https://medium.com/@rossbulat/deploy-a-3-node-mongodb-3-6-replica-set-with-x-509-authentication-self-signed-certificates-d539fda94db4
#mongo #mongodb #ssl #self_signed #openssl
In order to see how much time your MongoDB slave is behind the primary node, run this on the secondary:
rs0:SECONDARY> db.printSlaveReplicationInfo()
source: mongo.mongo.com:27017
syncedTo: Mon Nov 12 2018 06:33:40 GMT+0000 (UTC)
-4 secs (0 hrs) behind the primary
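If you want the same figure from Python, you can compute it from replSetGetStatus; a sketch (the host is illustrative):

import pymongo

client = pymongo.MongoClient('mongo.mongo.com', 27017)
status = client.admin.command('replSetGetStatus')

# optimeDate is the time of the last oplog entry applied on each member
primary = next(m for m in status['members'] if m['stateStr'] == 'PRIMARY')
for member in status['members']:
    if member['stateStr'] == 'SECONDARY':
        lag = (primary['optimeDate'] - member['optimeDate']).total_seconds()
        print(member['name'], 'is', lag, 'seconds behind the primary')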
#mongodb #mongo #slave #printSlaveReplicationInfo #replica #replication
MongoDB server Load Average: 0.5 (it can go as high as 16)
Database Size: 100 GB (it is compressed; in MySQL it reaches 300 GB in size!)
Req/Sec: 500
Our server seems hungry for more requests and more data.
#mongodb #mongo #awesomeness