Picka: A Python module for data generation and randomization.

Related tags

Data Analysispicka
Overview

Picka: A Python module for data generation and randomization.

Author: Anthony Long
Version: 1.0.1 - Fixed the broken image stuff. Whoops

What is Picka?

Picka generates randomized data for testing.

Data is generated both from a database of known good data (which is included), or by generating realistic data (valid), using string formatting (behind the scenes).

Picka has a function for any field you would need filled in. With selenium, something like would populate the "field-name-here" box for you, 100 times with random names.

for x in xrange(101):
        self.selenium.type('field-name-here', picka.male_name())

But this is just the beginning. Other ways to implement this, include using dicts:

user_information = {
        "first_name": picka.male_name(),
        "last_name": picka.last_name(),
        "email_address": picka.email(10, extension='example.org'),
        "password": picka.password_numerical(6),
}

This would provide:

{
        "first_name": "Jack",
        "last_name": "Logan",
        "email_address": "[email protected]",
        "password": "485444"
}

Don't forget, since all of the data is considered "clean" or valid - you can also use it to fill selects and other form fields with pre-defined values. For example, if you were to generate a state; picka.state() the result would be "Alabama". You can use this result to directly select a state in an address drop-down box.

Examples:

Selenium

def search_for_garbage():
        selenium.open('http://yahoo.com')
        selenium.type('id=search_box', picka.random_string(10))
        selenium.submit()

def test_search_for_garbage_results():
        search_for_garbage()
        selenium.wait_for_page_to_load('30000')
        assert selenium.get_xpath_count('id=results') == 0

Webdriver

driver = webdriver.Firefox()
driver.get("http://somesite.com")
x = {
        "name": [
                "#name",
                picka.name()
        ]
}
driver.find_element_by_css_selector(
        x["name"][0]).send_keys(x["name"][1]
)

Funcargs / pytest

def pytest_generate_tests(metafunc):
        if "test_string" in metafunc.funcargnames:
                for i in range(10):
                        metafunc.addcall(funcargs=dict(numiter=picka.random_string(20)))

def test_func(test_string):
        assert test_string.isalpha()
        assert len(test_string) == 20

MySQL / SQLite

first, last, age = picka.first_name(), picka.last_name(), picka.age()
cursor.execute(
   "insert into user_data (first_name, last_name, age) VALUES (?, ?, ?)",
   (first, last, age)
)

HTTP

def post(host, data):
        http = httplib.HTTP(host)
        return http.send(data)

def test_post_result():
        post("www.spam.egg/bacon.htm", picka.random_string(10))
Comments
  • No test suite

    No test suite

    Slightly ironic, a test data generation toolkit which doesnt have a test suite.

    Also setup.py doesnt declare Python 3 support, hence the need for a test suite to validate it works correctly.

    opened by jayvdb 1
  • Additional Functionality for Testers to Add Their Own Data

    Additional Functionality for Testers to Add Their Own Data

    Picka provides general data for testing. Leveraging this effort provides custom test data. Test data is not limited to just preconfigured values when it's possible to add custom test data. Data can be accessed sequentially, randomly or completely.

    opened by bkuehlhorn 1
  • Fixed test file, added alternative sentence maker

    Fixed test file, added alternative sentence maker

    1. Fixed usage of number in tests (it takes one arg, not two)
    2. Added sentence_actual, which returns an actual sentence from the Sherlock text.
    3. Added _picka._Book class to hold the text and split sentences read from Sherlock. Users can call sentence() without reading the entire file again and again.
    4. Added test of sentence_actual to picka.tests

    The sentence_actual function has some nice features:

    1. You're much less likely to get a sentence fragment
    2. You can specify a minimum and maximum number of words
    3. It should be relatively efficient, because the split sentences are cached by the _Book class.

    The sentences aren't always perfect, but I think that has to do with the source. A book other than Sherlock Holmes, preferably one with less dialog, would give more "normal" sentences.

    opened by TadLeonard 1
  • Library does not take locale into account

    Library does not take locale into account

    The library assumes an English locale is used (e.g., English-language hardcoded month names). Ideally the library would use locale-dependent constants so that computations are done correctly (e.g., the duration of a month in month_and_day):

    >>> locale.setlocale(locale.LC_ALL, 'it_IT')
    'it_IT'
    >>> picka.month()
    'Marzo'
    >>> picka.month_and_day()
    'Maggio 2'
    
    opened by svisser 0
  • picka.age will return ages outside of the bounds

    picka.age will return ages outside of the bounds

    If I call picka.age(1, 1) repeatedly I get 1 and 2 as results. I would have expected it to always return 1. Note that this situation can occur when passing variables to picka.age, I don't expect people to write this in their code themselves.

    I can also get ages outside of the bounds when I call picka.age(0, 1) which resorts to using the default values and can therefore return any age within the default values.

    opened by svisser 0
  • Module name means

    Module name means "cunt"

    I'm not sure if this is a real issue, but when I look at this module I cannot do so with a straight face. "Picka" is "cunt" in Serbian, Macedonian, Bosnian, Croatian, and I'm unsure as to whether there are other languages where this holds.

    While not grounds for any specific action, I find this largely amusing and just wanted to share.

    opened by geomaster 2
Releases(v0.96)
Candlestick Pattern Recognition with Python and TA-Lib

Candlestick-Pattern-Recognition-with-Python-and-TA-Lib Goal Look at the S&P500 to try and get a better understanding of these candlestick patterns and

Ganesh Jainarain 11 Oct 07, 2022
PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io

PyStan PyStan is a Python interface to Stan, a package for Bayesian inference. Stan® is a state-of-the-art platform for statistical modeling and high-

Stan 229 Dec 29, 2022
Spectacular AI SDK fuses data from cameras and IMU sensors and outputs an accurate 6-degree-of-freedom pose of a device.

Spectacular AI SDK examples Spectacular AI SDK fuses data from cameras and IMU sensors (accelerometer and gyroscope) and outputs an accurate 6-degree-

Spectacular AI 94 Jan 04, 2023
International Space Station data with Python research 🌎

International Space Station data with Python research 🌎 Plotting ISS trajectory, calculating the velocity over the earth and more. Plotting trajector

Facundo Pedaccio 41 Jun 16, 2022
Scraping and analysis of leetcode-compensations page.

Leetcode compensations report Scraping and analysis of leetcode-compensations page.

utsav 96 Jan 01, 2023
Two phase pipeline + StreamlitTwo phase pipeline + Streamlit

Two phase pipeline + Streamlit This is an example project that demonstrates how to create a pipeline that consists of two phases of execution. In betw

Rick Lamers 1 Nov 17, 2021
Yet Another Workflow Parser for SecurityHub

YAWPS Yet Another Workflow Parser for SecurityHub "Screaming pepper" by Rum Bucolic Ape is licensed with CC BY-ND 2.0. To view a copy of this license,

myoung34 8 Dec 22, 2022
A library to create multi-page Streamlit applications with ease.

A library to create multi-page Streamlit applications with ease.

Jackson Storm 107 Jan 04, 2023
Pipeline and Dataset helpers for complex algorithm evaluation.

tpcp - Tiny Pipelines for Complex Problems A generic way to build object-oriented datasets and algorithm pipelines and tools to evaluate them pip inst

Machine Learning and Data Analytics Lab FAU 3 Dec 07, 2022
Python library for creating data pipelines with chain functional programming

PyFunctional Features PyFunctional makes creating data pipelines easy by using chained functional operators. Here are a few examples of what it can do

Pedro Rodriguez 2.1k Jan 05, 2023
Tools for the analysis, simulation, and presentation of Lorentz TEM data.

ltempy ltempy is a set of tools for Lorentz TEM data analysis, simulation, and presentation. Features Single Image Transport of Intensity Equation (SI

McMorran Lab 1 Dec 26, 2022
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code

Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather

Tuplex 791 Jan 04, 2023
Elasticsearch tool for easily collecting and batch inserting Python data and pandas DataFrames

ElasticBatch Elasticsearch buffer for collecting and batch inserting Python data and pandas DataFrames Overview ElasticBatch makes it easy to efficien

Dan Kaslovsky 21 Mar 16, 2022
Finds, downloads, parses, and standardizes public bikeshare data into a standard pandas dataframe format

Finds, downloads, parses, and standardizes public bikeshare data into a standard pandas dataframe format.

Brady Law 2 Dec 01, 2021
a tool that compiles a csv of all h1 program stats

h1stats - h1 Program Stats Scraper This python3 script will call out to HackerOne's graphql API and scrape all currently active programs for informati

Evan 40 Oct 27, 2022
Pypeln is a simple yet powerful Python library for creating concurrent data pipelines.

Pypeln Pypeln (pronounced as "pypeline") is a simple yet powerful Python library for creating concurrent data pipelines. Main Features Simple: Pypeln

Cristian Garcia 1.4k Dec 31, 2022
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python

Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊

Thomas 2 May 26, 2022
Data and code accompanying the paper Politics and Virality in the Time of Twitter

Politics and Virality in the Time of Twitter Data and code accompanying the paper Politics and Virality in the Time of Twitter. In specific: the code

Cardiff NLP 3 Jul 02, 2022
Data Analytics: Modeling and Studying data relating to climate change and adoption of electric vehicles

Correlation-Study-Climate-Change-EV-Adoption Data Analytics: Modeling and Studying data relating to climate change and adoption of electric vehicles I

Jonathan Feng 1 Jan 03, 2022
Implementation in Python of the reliability measures such as Omega.

reliabiliPy Summary Simple implementation in Python of the [reliability](https://en.wikipedia.org/wiki/Reliability_(statistics) measures for surveys:

Rafael Valero Fernández 2 Apr 27, 2022