Getting started with Python LanguageList comprehensionsFilterListFunctionsDecoratorsMath ModuleLoopsRandom moduleComparisonsImporting modulesSorting, Minimum and MaximumOperator moduleVariable Scope and BindingBasic Input and OutputFiles & Folders I/OJSON ModuleString MethodsMetaclassesIndexing and SlicingGeneratorsSimple Mathematical OperatorsReduceMap FunctionExponentiationSearchingDictionaryClassesCountingManipulating XMLDate and TimeSetCollections moduleParallel computationMultithreadingWriting extensionsUnit TestingRegular Expressions (Regex)Bitwise OperatorsIncompatibilities moving from Python 2 to Python 3Virtual environmentsCopying dataTupleContext Managers (“with” Statement)Hidden FeaturesEnumString FormattingConditionalsComplex mathUnicode and bytesThe __name__ special variableChecking Path Existence and PermissionsPython NetworkingAsyncio ModuleThe Print Functionos.pathCreating Python packagesParsing Command Line argumentsHTML ParsingSubprocess Librarysetup.pyList slicing (selecting parts of lists)SocketsItertools ModuleRecursionBoolean OperatorsThe dis moduleType Hintspip: PyPI Package ManagerThe locale ModuleExceptionsWeb scraping with PythonDeque ModuleDistributionProperty ObjectsOverloadingDebuggingReading and Writing CSVDynamic code execution with `exec` and `eval`PyInstaller - Distributing Python CodeIterables and IteratorsData Visualization with PythonThe Interpreter (Command Line Console)*args and **kwargsFunctools ModuleGarbage CollectionIndentationSecurity and CryptographyPickle data serialisationurllibBinary DataPython and ExcelIdiomsMethod OverridingDifference between Module and PackageData SerializationPython concurrencyIntroduction to RabbitMQ using AMQPStormPostgreSQLDescriptorCommon PitfallsMultiprocessingtempfile NamedTemporaryFileWorking with ZIP archivesStackProfilingUser-Defined MethodsWorking around the Global Interpreter Lock (GIL)DeploymentLoggingProcesses and ThreadsThe os ModuleComments and DocumentationDatabase AccessPython HTTP ServerAlternatives to switch statement from other languagesList destructuring (aka packing and unpacking)Accessing Python source code and bytecodeMixinsAttribute AccessArcPyPython Anti-PatternsPlugin and Extension ClassesWebsocketsImmutable datatypes(int, float, str, tuple and frozensets)String representations of class instances: __str__ and __repr__ methodsArraysOperator PrecedencePolymorphismNon-official Python implementationsList ComprehensionsWeb Server Gateway Interface (WSGI)2to3 toolAbstract syntax treeAbstract Base Classes (abc)UnicodeSecure Shell Connection in PythonPython Serial Communication (pyserial)Neo4j and Cypher using Py2NeoBasic Curses with PythonPerformance optimizationTemplates in pythonPillowThe pass statementLinked List Nodepy.testDate FormattingHeapqtkinterCLI subcommands with precise help outputDefining functions with list argumentsSqlite3 ModulePython PersistenceTurtle GraphicsConnecting Python to SQL ServerDesign PatternsMultidimensional arraysAudioPygletQueue ModuleijsonWebbrowser ModuleThe base64 ModuleFlaskgroupby()Sockets And Message Encryption/Decryption Between Client and ServerpygameInput, Subset and Output External Data Files using Pandashashlibgetting start with GZipDjangoctypesCreating a Windows service using PythonPython Server Sent EventsMutable vs Immutable (and Hashable) in PythonPython speed of programconfigparserLinked listsCommonwealth ExceptionsOptical Character RecognitionPython Data TypesPartial functionspyautogui modulegraph-toolUnzipping FilesFunctional Programming in PythonPython Virtual Environment - virtualenvsysvirtual environment with virtualenvwrapperCreate virtual environment with virtualenvwrapper in windowsPython Requests PostPlotting with MatplotlibPython Lex-YaccChemPy - python packagepyaudioshelveUsage of "pip" module: PyPI Package ManagerIoT Programming with Python and Raspberry PICode blocks, execution frames, and namespaceskivy - Cross-platform Python Framework for NUI DevelopmentCall Python from C#Similarities in syntax, Differences in meaning: Python vs. JavaScriptWriting to CSV from String or ListRaise Custom Errors / ExceptionsUsing loops within functionsPandas Transform: Preform operations on groups and concatenate the results

Security and Cryptography

Other topics

Remarks:

Many of the methods in hashlib will require you to pass values interpretable as buffers of bytes, rather than strings. This is the case for hashlib.new().update() as well as hashlib.pbkdf2_hmac. If you have a string, you can convert it to a byte buffer by prepending the character b to the start of the string:

  "This is a string"
 b"This is a buffer of bytes"

Calculating a Message Digest

The hashlib module allows creating message digest generators via the new method. These generators will turn an arbitrary string into a fixed-length digest:

import hashlib

h = hashlib.new('sha256')
h.update(b'Nobody expects the Spanish Inquisition.')
h.digest()
# ==> b'.\xdf\xda\xdaVR[\x12\x90\xff\x16\xfb\x17D\xcf\xb4\x82\xdd)\x14\xff\xbc\xb6Iy\x0c\x0eX\x9eF-='

Note that you can call update an arbitrary number of times before calling digest which is useful to hash a large file chunk by chunk. You can also get the digest in hexadecimal format by using hexdigest:

h.hexdigest()
# ==> '2edfdada56525b1290ff16fb1744cfb482dd2914ffbcb649790c0e589e462d3d'

Available Hashing Algorithms

hashlib.new requires the name of an algorithm when you call it to produce a generator. To find out what algorithms are available in the current Python interpreter, use hashlib.algorithms_available:

import hashlib
hashlib.algorithms_available
# ==> {'sha256', 'DSA-SHA', 'SHA512', 'SHA224', 'dsaWithSHA', 'SHA', 'RIPEMD160', 'ecdsa-with-SHA1', 'sha1', 'SHA384', 'md5', 'SHA1', 'MD5', 'MD4', 'SHA256', 'sha384', 'md4', 'ripemd160', 'sha224', 'sha512', 'DSA', 'dsaEncryption', 'sha', 'whirlpool'}

The returned list will vary according to platform and interpreter; make sure you check your algorithm is available.

There are also some algorithms that are guaranteed to be available on all platforms and interpreters, which are available using hashlib.algorithms_guaranteed:

hashlib.algorithms_guaranteed
# ==> {'sha256', 'sha384', 'sha1', 'sha224', 'md5', 'sha512'}

Secure Password Hashing

The PBKDF2 algorithm exposed by hashlib module can be used to perform secure password hashing. While this algorithm cannot prevent brute-force attacks in order to recover the original password from the stored hash, it makes such attacks very expensive.

import hashlib
import os

salt = os.urandom(16)
hash = hashlib.pbkdf2_hmac('sha256', b'password', salt, 100000)

PBKDF2 can work with any digest algorithm, the above example uses SHA256 which is usually recommended. The random salt should be stored along with the hashed password, you will need it again in order to compare an entered password to the stored hash. It is essential that each password is hashed with a different salt. As to the number of rounds, it is recommended to set it as high as possible for your application.

If you want the result in hexadecimal, you can use the binascii module:

import binascii
hexhash = binascii.hexlify(hash)

Note: While PBKDF2 isn't bad, bcrypt and especially scrypt are considered stronger against brute-force attacks. Neither is part of the Python standard library at the moment.

File Hashing

A hash is a function that converts a variable length sequence of bytes to a fixed length sequence. Hashing files can be advantageous for many reasons. Hashes can be used to check if two files are identical or verify that the contents of a file haven't been corrupted or changed.

You can use hashlib to generate a hash for a file:

import hashlib

hasher = hashlib.new('sha256')
with open('myfile', 'r') as f:
    contents = f.read()
    hasher.update(contents)

print hasher.hexdigest() 

For larger files, a buffer of fixed length can be used:

import hashlib
SIZE = 65536
hasher = hashlib.new('sha256')
with open('myfile', 'r') as f:
    buffer = f.read(SIZE)
    while len(buffer) > 0:
        hasher.update(buffer)
        buffer = f.read(SIZE)
print(hasher.hexdigest())

Symmetric encryption using pycrypto

Python's built-in crypto functionality is currently limited to hashing. Encryption requires a third-party module like pycrypto. For example, it provides the AES algorithm which is considered state of the art for symmetric encryption. The following code will encrypt a given message using a passphrase:

import hashlib
import math
import os

from Crypto.Cipher import AES

IV_SIZE = 16    # 128 bit, fixed for the AES algorithm
KEY_SIZE = 32   # 256 bit meaning AES-256, can also be 128 or 192 bits
SALT_SIZE = 16  # This size is arbitrary

cleartext = b'Lorem ipsum'
password = b'highly secure encryption password'
salt = os.urandom(SALT_SIZE)
derived = hashlib.pbkdf2_hmac('sha256', password, salt, 100000,
                              dklen=IV_SIZE + KEY_SIZE)
iv = derived[0:IV_SIZE]
key = derived[IV_SIZE:]

encrypted = salt + AES.new(key, AES.MODE_CFB, iv).encrypt(cleartext)

The AES algorithm takes three parameters: encryption key, initialization vector (IV) and the actual message to be encrypted. If you have a randomly generated AES key then you can use that one directly and merely generate a random initialization vector. A passphrase doesn't have the right size however, nor would it be recommendable to use it directly given that it isn't truly random and thus has comparably little entropy. Instead, we use the built-in implementation of the PBKDF2 algorithm to generate a 128 bit initialization vector and 256 bit encryption key from the password.

Note the random salt which is important to have a different initialization vector and key for each message encrypted. This ensures in particular that two equal messages won't result in identical encrypted text, but it also prevents attackers from reusing work spent guessing one passphrase on messages encrypted with another passphrase. This salt has to be stored along with the encrypted message in order to derive the same initialization vector and key for decrypting.

The following code will decrypt our message again:

salt = encrypted[0:SALT_SIZE]
derived = hashlib.pbkdf2_hmac('sha256', password, salt, 100000,
                              dklen=IV_SIZE + KEY_SIZE)
iv = derived[0:IV_SIZE]
key = derived[IV_SIZE:]
cleartext = AES.new(key, AES.MODE_CFB, iv).decrypt(encrypted[SALT_SIZE:])

Generating RSA signatures using pycrypto

RSA can be used to create a message signature. A valid signature can only be generated with access to the private RSA key, validating on the other hand is possible with merely the corresponding public key. So as long as the other side knows your public key they can verify the message to be signed by you and unchanged - an approach used for email for example. Currently, a third-party module like pycrypto is required for this functionality.

import errno

from Crypto.Hash import SHA256
from Crypto.PublicKey import RSA
from Crypto.Signature import PKCS1_v1_5

message = b'This message is from me, I promise.'

try:
    with open('privkey.pem', 'r') as f:
        key = RSA.importKey(f.read())
except IOError as e:
    if e.errno != errno.ENOENT:
        raise
    # No private key, generate a new one. This can take a few seconds.
    key = RSA.generate(4096)
    with open('privkey.pem', 'wb') as f:
        f.write(key.exportKey('PEM'))
    with open('pubkey.pem', 'wb') as f:
        f.write(key.publickey().exportKey('PEM'))

hasher = SHA256.new(message)
signer = PKCS1_v1_5.new(key)
signature = signer.sign(hasher)

Verifying the signature works similarly but uses the public key rather than the private key:

with open('pubkey.pem', 'rb') as f:
    key = RSA.importKey(f.read())
hasher = SHA256.new(message)
verifier = PKCS1_v1_5.new(key)
if verifier.verify(hasher, signature):
    print('Nice, the signature is valid!')
else:
    print('No, the message was signed with the wrong private key or modified')

Note: The above examples use PKCS#1 v1.5 signing algorithm which is very common. pycrypto also implements the newer PKCS#1 PSS algorithm, replacing PKCS1_v1_5 by PKCS1_PSS in the examples should work if you want to use that one. Currently there seems to be little reason to use it however.

Asymmetric RSA encryption using pycrypto

Asymmetric encryption has the advantage that a message can be encrypted without exchanging a secret key with the recipient of the message. The sender merely needs to know the recipients public key, this allows encrypting the message in such a way that only the designated recipient (who has the corresponding private key) can decrypt it. Currently, a third-party module like pycrypto is required for this functionality.

from Crypto.Cipher import PKCS1_OAEP
from Crypto.PublicKey import RSA

message = b'This is a very secret message.'

with open('pubkey.pem', 'rb') as f:
    key = RSA.importKey(f.read())
cipher = PKCS1_OAEP.new(key)
encrypted = cipher.encrypt(message)

The recipient can decrypt the message then if they have the right private key:

with open('privkey.pem', 'rb') as f:
    key = RSA.importKey(f.read())
cipher = PKCS1_OAEP.new(key)
decrypted = cipher.decrypt(encrypted)

Note: The above examples use PKCS#1 OAEP encryption scheme. pycrypto also implements PKCS#1 v1.5 encryption scheme, this one is not recommended for new protocols however due to known caveats.

Syntax:

  • hashlib.new(name)
  • hashlib.pbkdf2_hmac(name, password, salt, rounds, dklen=None)

Contributors

Topic Id: 2598

Example Ids: 8620,8621,8622,9666,18926,19025,19026

This site is not affiliated with any of the contributors.