Data Analysis
Create a dataframe from a dictionary in Pandas:

import pandas

data = [{'id': 1, 'name': 'alireza'}, {'id': 2, 'name': 'Mohsen'}]
# Creating a dataframe from a list of dictionaries
df = pandas.DataFrame(data)
Now if you print the dataframe:

> df
   id     name
0   1  alireza
1   2   Mohsen
NOTE: the first column is the index column. In order to turn the dataframe back into a dictionary after your aggregation, analysis, etc., just use to_dict like below:

df.to_dict(orient='records')
[{'id': 1, 'name': 'alireza'}, {'id': 2, 'name': 'Mohsen'}]
You are right, we didn't do anything useful with the records; the goal is just to show how to turn a dataframe into a dictionary, nothing more.
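For instance, here is a minimal sketch (the id filter is made up purely for illustration) of doing a little work on the frame before converting back:

df[df['id'] > 1].to_dict(orient='records')
# [{'id': 2, 'name': 'Mohsen'}]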
NOTE: on older versions of pandas you have to use outtype='records' rather than orient='records'.

#python #pandas #to_dict #outtype #orient #dictionary #dataframe
Allow outgoing requests through the server with CSF (ConfigServer Security & Firewall)

If by debugging you can confirm that your server cannot reach an external port but other servers can, just add the desired port to
CSF with the destination IP address, in the format tcp/udp|in/out|s/d=port|s/d=ip, in /etc/csf/csf.allow:
tcp|out|d=1080|d=15.9.8.223

The above line is advanced port+IP filtering: it opens port 1080 toward the destination server 15.9.8.223. You can also enter just one IP address per line to be allowed through iptables. To apply the new changes, run csf -r on the command line after saving the filter.
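For instance, a minimal sketch (reusing the example rule above) that appends the rule and reloads CSF in one go:

# append the allow rule and reload CSF so it takes effect
echo 'tcp|out|d=1080|d=15.9.8.223' >> /etc/csf/csf.allow
csf -r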
NOTE: one IP address per line is mandatory.
NOTE: CIDR addressing is allowed with a quaded IP (e.g. 192.168.254.0/24).
NOTE: only list IP addresses, not domain names (they will be ignored).

#linux #csf #iptables
How to find documents with a specific field type?
It may happen that a specific field like credit holds mixed types: some values are numeric and some are strings. In order to find NumberLong() field types you can use $type:

db.users.find({credit: {$type: "long"}})
If you want to remove those documents, use remove instead of find to delete the documents that have the wrong type.
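For example, a minimal sketch (assuming the string values are the unwanted ones here):

db.users.remove({credit: {$type: "string"}})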
It is not sensible to do that on a users collection though; it just gives you the idea.

#mongodb #mongo #type #field_type #remove #find
Get the current directory from within a bash script:
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
echo "$SCRIPT_DIR"
dirname "$0" extracts the directory part of the script's path, cd changes into that directory, and finally pwd prints the resulting absolute path, which in our case is stored in SCRIPT_DIR.

#linux #bash #script #shell #pwd #current_directory
With Telegram filtered, users are looking for alternative messaging applications. One of the well-known applications among tech-savvy folks is Slack. It is filtered too, but not in the way Telegram is filtered: you can call the API without any proxy or SOCKS server. The app itself is filtered, but once you log in for the first time using a proxy server, you do not need a SOCKS server anymore, as you stay logged in all the time.

One of the Slack developer kits for Python is python-slackclient. Installation is easy as pie:

pip install slackclient
When you get your bot token, all you need to do to send a message to a channel is:
from slackclient import SlackClient

slack_token = "xoxp-YOUR-TOKEN"
sc = SlackClient(slack_token)
sc.api_call(
    "chat.postMessage",
    channel="#python",
    text="Hello from Python! 🎉"
)
#python is the name of your channel. For more information go to the link below:

- http://slackapi.github.io/python-slackclient/
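The call returns a dict you can inspect; here is a minimal sketch (reusing sc from above) of checking whether the post succeeded:

response = sc.api_call("chat.postMessage", channel="#python", text="Hello again!")
if not response.get("ok"):
    # the error field says what went wrong, e.g. invalid_auth or channel_not_found
    print("Slack error:", response.get("error"))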
#python #slack #python_slackclient #im #telegram
Meltdown: the latest news on two major CPU security bugs
Two major computer processor security bugs, dubbed Meltdown and Spectre, affect nearly every device made in the last 20 years. The ramifications of how much these bugs will impact computing are still playing out, but they could lead to compromised servers for cloud platforms and other farther-reaching effects.
#news #bug #meltdown #spectre #cpu
How to add authentication to MongoDB?

At first you need to create an admin user, so bring up a mongo shell by typing
mongo
in your terminal and hitting enter. The database where users are stored is admin, so switch to the admin database:

use admin
Now, by using the createUser database method, we will create a user called myUserAdmin:

db.createUser(
  {
    user: "myUserAdmin",
    pwd: "1234qwer",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  }
)
Disconnect the mongo shell.
The important note about mongo is to run it with the --auth argument, otherwise authentication will not work:

mongod --auth --port 27017 --dbpath /data/db1
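After restarting mongod with --auth, connecting with the credentials created above looks like this (standard mongo shell flags):

mongo --port 27017 -u myUserAdmin -p 1234qwer --authenticationDatabase admin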
#mongodb #mongo #auth #authentication #create_user
What is
shard and replication in MongoDB? What is the difference?

MongoDB has two concepts that may confuse even intermediate programmers! So let's break them down and explain both in depth.
1- Take a deep breath. :)
2- Replication: to replicate means to reproduce or make an exact copy of something. In MongoDB replication, all data sets are mirrored onto other servers. This process is used for fault tolerance. If there are 4 mongo servers and your dataset is 1 terabyte, each node in the replica-set will hold 1 terabyte of data.

In a replica-set there is ONE master (primary) node and one or more slaves (secondaries). Read performance can be improved by adding more and more slaves, but not writes! Adding more slaves does not affect writes, because all writes go to the master first and are then propagated to the slaves.

3- Sharding: sharding, on the other hand, is a completely different concept. If you have 1 terabyte of data and 4 servers, each node will hold 250 gigabytes. As you may have guessed, it is not fault tolerant on its own, because each part of the data resides on a separate server. Each read and write is sent to the corresponding shard, so if you add more shards, both read and write performance improve across the cluster. When one shard of the cluster goes down, any data on it is inaccessible. For that reason each member of the cluster should itself be a replica-set, though it is not required to be.
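If you want to check which setup a shell is connected to, these standard mongo shell helpers give a quick view (the output depends on your deployment):

rs.status()   // replica-set view: who is primary, who are the secondaries
sh.status()   // sharded-cluster view: shards, chunk distribution, balancer state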
4- Take another deep breath, and let's get back to work.
#mongodb #mongo #shard #replica #replication #sharding #cluster
Migrate a running process into tmux
reptyr is a utility for taking an existing running program and attaching it to a new terminal. Started a long-running process over ssh, but have to leave and don't want to interrupt it? Just start a screen or tmux, use reptyr to grab the process, and then kill the ssh session and head on home.

sudo apt-get install -y reptyr  # For Ubuntu users
Send the current foreground job to the background using CTRL-Z. List all the background jobs using jobs -l; this will get you the PID:

jobs -l
[1] + 16189 suspended  vim foobar.rst

Here the PID is 16189. Start a new
tmux or screen session. I will be using tmux:

tmux
Reattach the background process using:
reptyr 16189
If this error appears:
Unable to attach to pid 16189: Operation not permitted
The kernel denied permission while attaching
Then type in the following command as root:
echo 0 > /proc/sys/kernel/yama/ptrace_scope
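Putting it all together, a recap sketch of the whole flow (the disown step is an extra precaution not covered above; replace the PID with your own):

# in the original ssh session: press CTRL-Z to suspend the job, then
bg        # resume the job in the background
jobs -l   # note its PID
disown    # detach it from this shell so logging out won't kill it

# in the new tmux session
reptyr 16189   # use the PID you noted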
#reptyr #tmux #screen #pid
1. List all Open Files with lsof Command
> lsof
COMMAND PID USER FD  TYPE DEVICE SIZE/OFF   NODE NAME
init      1 root cwd  DIR  253,0     4096      2 /
init      1 root rtd  DIR  253,0     4096      2 /
init      1 root txt  REG  253,0   145180 147164 /sbin/init
init      1 root mem  REG  253,0  1889704 190149 /lib/libc-2.12.so
The FD column stands for File Descriptor. Its values are as below:

- cwd: current working directory
- rtd: root directory
- txt: program text (code and data)
- mem: memory-mapped file

To get the count of open files you can use wc -l with lsof as follows:

lsof | wc -l
2. List a Specific User's Open Files
lsof -u alireza
COMMAND  PID    USER FD  TYPE DEVICE SIZE/OFF NODE NAME
sshd    1838 alireza cwd  DIR  253,0     4096    2 /
sshd    1838 alireza rtd  DIR  253,0     4096    2 /
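Two more invocations that come in handy (the PID is sshd's from the output above; port 80 is just an example):

lsof -p 1838   # files opened by a specific process
lsof -i :80    # network sockets involving port 80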
#linux #sysadmin #lsof #wc #file_descriptor
Today we encountered slowness on MongoDB that affected our entire infrastructure. The problem was that slowness on some specific mongo queries caused all the other queries to wait. YES, we use indexes, and YES, we ran explain on those queries and saw that they were using the index. To mitigate the issue we had to kill very slow
find queries until we fixed the issue.

The function below kills slow queries:
function (sec) {
  db.currentOp()['inprog'].forEach(function (query) {
    if (query.op !== 'query') { return; }
    if (query.secs_running < sec) { return; }
    print(['Killing query:', query.opid, 'which was running:', query.secs_running, 'sec.'].join(' '));
    db.killOp(query.opid);
  });
}
We need to save this function in mongo itself and run it there directly. To save the above function in MongoDB use db.system.js.save:

db.system.js.save({
  _id: "kill_slow_queries",
  value: function (sec) {
    db.currentOp()['inprog'].forEach(function (query) {
      if (query.op !== 'query') { return; }
      if (query.secs_running < sec) { return; }
      print(['Killing query:', query.opid, 'which was running:', query.secs_running, 'sec.'].join(' '));
      db.killOp(query.opid);
    });
  }
})
I will explain the parts of the above function in a different post. For now, you need to load the server scripts and then run it:
db.loadServerScripts()
kill_slow_queries(20)
The above call kills queries that have been running longer than 20 seconds.
NOTE: you can create a shell script and run it periodically using crontab until you fix the slowness on your server.
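A sketch of such a cron entry (the cron file path and database name are made up; adjust them to your setup):

# /etc/cron.d/kill_slow_mongo -- every minute, kill queries running longer than 20s
* * * * * root mongo mydb --eval 'db.loadServerScripts(); kill_slow_queries(20)'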
#mongodb #mongo #function #kill_slow_queries #currentOp
MongoDB has a top utility, like the Linux top command, that displays how much time was spent on reads, writes, and in total for every namespace (collection). To run mongotop you just need to run:

mongotop
The output is something like below:
root@hs-1:~# mongotop
2018-01-09T13:42:42.177+0000 connected to: 127.0.0.1
                    ns  total  read  write  2018-01-09T13:42:43Z
         users.profile   28ms  28ms    0ms
          authz.tokens    7ms   7ms    0ms
            mielin.obx    3ms   3ms    0ms
       conduc.contacts    1ms   1ms    0ms
    admin.system.roles    0ms   0ms    0ms
The above command runs every second; to increase the interval use mongotop YOUR_INTERVAL_IN_SECONDS. If you want the result in JSON use mongotop --json. If you want to return the result once and exit use mongotop --rowcount 1.
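These options can be combined with the interval argument; for example, a sketch that takes three JSON samples five seconds apart:

mongotop --json --rowcount 3 5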
#mongodb #mongo #mongotop #read #write
In previous posts we explained query slowness. Here we try to explain the different parts of the function.
db.currentOp: in-progress operations in MongoDB are displayed by this command. The response of the command is in JSON format, so you can use an expression like db.currentOp()['inprog']. The response has a lot of useful information, like lock status, numYields, and so on.

The part we are interested in is
the opid part. opid is the ID number of the query operation. The op section of each operation shows the type of the query: it can be an internal database command, an insert command, or a query. secs_running is the part we check to see whether a query has been taking a long time; it is in seconds.

db.killOp: killing an operation is just as simple as giving the opid number to killOp, as below:

db.killOp(6123213)
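Before killing anything, it helps to look first. A minimal sketch (the 10-second threshold is arbitrary) that only prints the long-running query operations:

db.currentOp()['inprog'].forEach(function (op) {
  if (op.op === 'query' && op.secs_running > 10) {
    print('opid ' + op.opid + ' has been running for ' + op.secs_running + 's');
  }
})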
This is all we've done in the previous posts to kill slow queries in MongoDB.
#mongodb #mongo #currentOp #killOp #opid
See live disk IO status by using iostat:

iostat -dx 1
The output has many columns. The parts I'm interested in for now are r/s, which refers to reads per second, and w/s, which is writes per second. To see the size read and written per second, see the rkB/s and wkB/s columns, in that order.
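If you only care about a single disk, you can name the device and cap the number of reports (sda is assumed here):

iostat -dx sda 1 10   # sda only, one report per second, ten reports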
NOTE: if you don't have iostat on your Linux OS, install it on Debian by issuing the apt-get install sysstat command.

#linux #debian #iostat #read_per_second #write_per_second #sysstat
Benchmark disk performance using
hdparm & dd

In order to get a meaningful result, run each test a couple of times.
Direct read (without cache):
$ sudo hdparm -t /dev/sda2
/dev/sda2:
Timing buffered disk reads: 302 MB in 3.00 seconds = 100.58 MB/sec
And here's a cached read:
$ sudo hdparm -T /dev/sda2
/dev/sda2:
Timing cached reads: 4636 MB in 2.00 seconds = 2318.89 MB/sec
-t: Perform timings of device reads for benchmark and comparison purposes. For meaningful results, this operation should be repeated 2-3 times on an otherwise inactive system (no other active processes) with at least a couple of megabytes of free memory. This displays the speed of reading through the buffer cache to the disk without any prior caching of data. This measurement is an indication of how fast the drive can sustain sequential data reads under Linux, without any filesystem overhead. To ensure accurate measurements, the buffer cache is flushed during the processing of -t using the BLKFLSBUF ioctl.
-T: Perform timings of cache reads for benchmark and comparison purposes. For meaningful results, this operation should be repeated 2-3 times on an otherwise inactive system (no other active processes) with at least a couple of megabytes of free memory. This displays the speed of reading directly from the Linux buffer cache without disk access. This measurement is essentially an indication of the throughput of the processor, cache, and memory of the system under test.
You can use the dd command to test your hard disk too:

$ time sh -c "dd if=/dev/zero of=ddfile bs=8k count=250000 && sync"; rm ddfile

rm ddfile removes the test file created by dd's of=ddfile; the of param stands for output file.

These are some useful and simple disk benchmarking tools.
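You can also time reads with dd. A sketch (needs root for the cache drop, and assumes you kept ddfile around instead of removing it right away):

sync && echo 3 > /proc/sys/vm/drop_caches   # flush the page cache first (as root)
dd if=ddfile of=/dev/null bs=8k             # sequential read of the test file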
#linux #benchmark #hdd #dd #hard_disk #hdparm