Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding delete #236

Merged
merged 20 commits into from
May 21, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -176,6 +176,21 @@ will be added to the existing host entry.
If the canonical facts based lookup does not locate an existing host, then
a new host entry is created.

#### Host deletion

Hosts can be deleted by using the DELETE HTTP Method on the _/hosts/id_ endpoint.
When a host is deleted, the inventory service will send an event message
to the _platform.inventory.events_ message queue. The delete event message
will look like the following:

```json
{"id": <host id>, "timestamp": <delete timestamp>, "type": "delete"}
```

- type: type of host change (delete in this case)
- id: Inventory host id of the host that was deleted
- timestamp: the time at which the host was deleted

#### Testing API Calls

It is necessary to pass an authentication header along on each call to the
Expand Down
34 changes: 30 additions & 4 deletions api/host.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,12 +7,13 @@
from flask_api import status
from marshmallow import ValidationError

from app import db
from app import db, events
from app.models import Host, HostSchema, PatchHostSchema
from app.auth import current_identity
from app.exceptions import InventoryException
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is unrelated to the DELETE operation.

from app.logging import get_logger
from api import api_operation, metrics
from tasks import emit_event


TAG_OPERATIONS = ("apply", "remove")
Expand Down Expand Up @@ -244,6 +245,26 @@ def find_hosts_by_hostname_or_id(account_number, hostname):
sqlalchemy.or_(*filter_list)]))


@api_operation
@metrics.api_request_time.time()
def delete_by_id(host_id_list):
query = _get_host_list_by_id_list(
current_identity.account_number, host_id_list, order=False
)

hosts = query.all()

if not hosts:
return flask.abort(status.HTTP_404_NOT_FOUND)

with metrics.delete_host_processing_time.time():
query.delete(synchronize_session="fetch")
db.session.commit()
metrics.delete_host_count.inc(len(hosts))
for deleted_host in hosts:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually doesn’t work The hosts variable is a BaseQuery, not a list. This query is used just a few lines earlier to delete the hosts. Then here, iterating over the hosts run a SELECT query, which doesn’t find anything, because the hosts are already deleted.

@api_operation
@metrics.api_request_time.time()
def delete_by_id(host_id_list):
    query = _get_host_list_by_id_list(
        current_identity.account_number, host_id_list, order=False
    )
    hosts = query.all()
    with metrics.delete_host_processing_time.time():
        query.delete(synchronize_session="fetch")
    db.session.commit()
    metrics.delete_host_count.inc(len(hosts))
    for deleted_host in hosts:
        emit_event(events.delete(deleted_host.id))

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed.

emit_event(events.delete(deleted_host.id))

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no return statement. The API specification specifies though, that this operations should return 404 Not Found if the host is not found. This is currently not true: 200 OK is returned automatically, but nothing is deleted.

Because this can operate on more than one host, I suggest using 207 Multi-Status instead, just at add_host_list does. Like that it’d be possible to return more statūs, one for each records. It can happen that some hosts are found (and deleted) and some are not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, I do not think adding the 207 multi-status is required/needed. We can add it later on if we find that it is needed.


@api_operation
@metrics.api_request_time.time()
def get_host_by_id(host_id_list, page=1, per_page=100):
Expand All @@ -259,11 +280,16 @@ def get_host_by_id(host_id_list, page=1, per_page=100):
)


def _get_host_list_by_id_list(account_number, host_id_list):
return Host.query.filter(
def _get_host_list_by_id_list(account_number, host_id_list, order=True):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking whether we couldn’t just remove the order_by from this function entirely and use it in the respective places. E.g. the patch operation doesn’t need the ordering at all. The oder operations (get_host_by_id and get_host_system_profile_by_id) use paginate, which is the reason for the ordering. So although repeating the oder_by wouldn’t be dry, it would fit nicely together. But this is not necessary to edit in this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for now.

q = Host.query.filter(
(Host.account == account_number)
& Host.id.in_(host_id_list)
).order_by(Host.created_on, Host.id)
)

if order:
return q.order_by(Host.created_on, Host.id)
else:
return q


@api_operation
Expand Down
4 changes: 4 additions & 0 deletions api/metrics.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,10 @@
"The total amount of hosts created")
update_host_count = Counter("inventory_update_host_count",
"The total amount of hosts updated")
delete_host_count = Counter("inventory_delete_host_count",
"The total amount of hosts deleted")
delete_host_processing_time = Summary("inventory_delete_host_commit_seconds",
"Time spent deleting hosts from the database")
login_failure_count = Counter("inventory_login_failure_count",
"The total amount of failed login attempts")
system_profile_deserialization_time = Summary("inventory_system_profile_deserialization_time",
Expand Down
9 changes: 4 additions & 5 deletions app/__init__.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@
import os
import connexion
import yaml

Expand All @@ -11,7 +10,7 @@
from app.exceptions import InventoryException
from app.logging import configure_logging, threadctx
from app.validators import verify_uuid_format # noqa: 401
from tasks import start_consumer
from tasks import init_tasks

REQUEST_ID_HEADER = "x-rh-insights-request-id"
UNKNOWN_REQUEST_ID_VALUE = "-1"
Expand All @@ -30,7 +29,8 @@ def create_app(config_name):
# needs to be setup before the flask app is initialized.
configure_logging(config_name)

app_config = Config(config_name)
app_config = Config()
app_config.log_configuration(config_name)

connexion_app = connexion.App(
"inventory", specification_dir="./swagger/", options=connexion_options
Expand Down Expand Up @@ -75,7 +75,6 @@ def set_request_id():
REQUEST_ID_HEADER,
UNKNOWN_REQUEST_ID_VALUE)

if all(map(os.environ.get, ["KAFKA_TOPIC", "KAFKA_GROUP", "KAFKA_BOOTSTRAP_SERVERS"])):
start_consumer(flask_app)
init_tasks(app_config, flask_app)

return flask_app
13 changes: 8 additions & 5 deletions app/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,8 @@


class Config:
def __init__(self, config_name):
def __init__(self):
self.logger = get_logger(__name__)
self._config_name = config_name

self._db_user = os.getenv("INVENTORY_DB_USER", "insights")
self._db_password = os.getenv("INVENTORY_DB_PASS", "insights")
Expand All @@ -25,7 +24,11 @@ def __init__(self, config_name):

self.api_urls = [self.api_url_path_prefix, self.legacy_api_url_path_prefix]

self._log_configuration()
self.system_profile_topic = os.environ.get("KAFKA_TOPIC", "platform.system-profile")
self.consumer_group = os.environ.get("KAFKA_GROUP", "inventory")
self.bootstrap_servers = os.environ.get("KAFKA_BOOTSTRAP_SERVERS", "kafka:29092")
self.event_topic = os.environ.get("KAFKA_EVENT_TOPIC", "platform.inventory.events")
self.kafka_enabled = all(map(os.environ.get, ["KAFKA_TOPIC", "KAFKA_GROUP", "KAFKA_BOOTSTRAP_SERVERS"]))

def _build_base_url_path(self):
app_name = os.getenv("APP_NAME", "inventory")
Expand All @@ -39,8 +42,8 @@ def _build_api_path(self):
api_path = f"{base_url_path}/{version}"
return api_path

def _log_configuration(self):
if self._config_name != "testing":
def log_configuration(self, config_name):
if config_name != "testing":
self.logger.info("Insights Host Inventory Configuration:")
self.logger.info("Build Version: %s" % get_build_version())
self.logger.info("API URL Path: %s" % self.api_url_path_prefix)
Expand Down
17 changes: 17 additions & 0 deletions app/events.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
import logging
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I like putting the event schema into a separate module instead of the ever growing app.models one, it’s currently unsystematic. The app.models module already contains many schemas that aren’t models at all, e.g. the PatchHostSchema. I’d keep everything in that bulky module for now and split it later separately.

The delete method can be for now put inline into the delete_by_id operation. It’s not very nice, but it’s how we currently do it. The HTTP responses, errors and everything are composed there too. I’d rather have the code consistent than to see many first steps to various refactorings.

from datetime import datetime
from marshmallow import Schema, fields

logger = logging.getLogger(__name__)


class HostEvent(Schema):
id = fields.UUID()
timestamp = fields.DateTime(format="iso8601")
type = fields.Str()


def delete(id):
return HostEvent(strict=True).dumps(
{"id": id, "timestamp": datetime.utcnow(), "type": "delete"}
).data
16 changes: 16 additions & 0 deletions dev.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,3 +9,19 @@ services:
POSTGRES_DB: insights
ports:
- "5432:5432"
zookeeper:
image: confluentinc/cp-zookeeper
environment:
- ZOOKEEPER_CLIENT_PORT=32181
- ZOOKEEPER_SERVER_ID=1
kafka:
image: confluentinc/cp-kafka
ports:
- 29092:29092
depends_on:
- zookeeper
environment:
- KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:29092
- KAFKA_BROKER_ID=1
- KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
- KAFKA_ZOOKEEPER_CONNECT=zookeeper:32181
2 changes: 1 addition & 1 deletion logconfig.ini
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ propagate=0
qualname=inventory.api

[logger_tasks]
level=INFO
level=DEBUG
handlers=logstash
propagate=0
qualname=inventory.tasks
Expand Down
25 changes: 25 additions & 0 deletions swagger/api.spec.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,31 @@ paths:
description: Invalid request.
'404':
description: Host not found.
delete:
tags:
- hosts
summary: Delete hosts by IDs
description: Delete hosts by IDs
operationId: api.host.delete_by_id
security:
- ApiKeyAuth: []
parameters:
- in: path
name: host_id_list
description: A comma separated list of host IDs.
required: true
schema:
type: array
items:
type: string
format: uuid
responses:
'200':
description: Successfully deleted hosts.
'400':
description: Invalid request.
'404':
description: Host not found.
patch:
tags:
- hosts
Expand Down
50 changes: 42 additions & 8 deletions tasks/__init__.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
import os
import json
from kafka import KafkaConsumer
from kafka import KafkaProducer
from threading import Thread

from api import metrics
Expand All @@ -10,9 +10,39 @@

logger = get_logger(__name__)

TOPIC = os.environ.get("KAFKA_TOPIC", "platform.system-profile")
KAFKA_GROUP = os.environ.get("KAFKA_GROUP", "inventory")
BOOTSTRAP_SERVERS = os.environ.get("KAFKA_BOOTSTRAP_SERVERS", "kafka:29092")

class NullProducer:

def send(self, topic, value=None):
logger.debug("NullProducer - logging message: topic (%s) - message: %s" %
(topic, value))


producer = None
cfg = None


def init_tasks(config, flask_app):
global cfg
global producer

cfg = config

producer = _init_event_producer(config)
_init_system_profile_consumer(config, flask_app)


def _init_event_producer(config):
if config.kafka_enabled:
logger.info("Starting KafkaProducer()")
return KafkaProducer(bootstrap_servers=config.bootstrap_servers)
else:
logger.info("Starting NullProducer()")
return NullProducer()


def emit_event(e):
producer.send(cfg.event_topic, value=e.encode("utf-8"))


@metrics.system_profile_commit_processing_time.time()
Expand All @@ -32,15 +62,19 @@ def msg_handler(parsed):
db.session.commit()


def start_consumer(flask_app, handler=msg_handler, consumer=None):
def _init_system_profile_consumer(config, flask_app, handler=msg_handler, consumer=None):

if not config.kafka_enabled:
logger.info("System profile consumer has been disabled")
return

logger.info("Starting system profile queue consumer.")

if consumer is None:
consumer = KafkaConsumer(
TOPIC,
group_id=KAFKA_GROUP,
bootstrap_servers=BOOTSTRAP_SERVERS)
config.system_profile_topic,
group_id=config.consumer_group,
bootstrap_servers=config.bootstrap_servers)

def _f():
with flask_app.app_context():
Expand Down
51 changes: 50 additions & 1 deletion test_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
import uuid
import copy
import tempfile
from app import create_app, db
from app import create_app, db, events
from app.auth.identity import Identity
from app.utils import HostWrapper
from base64 import b64encode
Expand Down Expand Up @@ -101,6 +101,13 @@ def put(self, path, data, status=200, return_response_as_json=True):
self.client().put, path, data, status, return_response_as_json
)

def delete(self, path, status=200, return_response_as_json=True):
return self._response_check(
self.client().delete(path, headers=self._get_valid_auth_header()),
status,
return_response_as_json,
)

def verify_error_response(self, response, expected_title=None,
expected_status=None, expected_detail=None,
expected_type=None):
Expand Down Expand Up @@ -1298,6 +1305,47 @@ def test_invalid_data(self):
expected_status=400)


class DeleteHostsTestCase(PreCreatedHostsBaseTestCase):

def test_create_then_delete(self):
original_id = self.added_hosts[0].id

url = HOST_URL + "/" + original_id

# Get the host
self.get(url, 200)

class MockEmitEvent:

def __init__(self):
self.events = []

def __call__(self, e):
self.events.append(e)

# Delete the host
with unittest.mock.patch("api.host.emit_event", new=MockEmitEvent()) as m:
self.delete(url, 200, return_response_as_json=False)
assert original_id in m.events[0]

# Try to get the host again
response = self.get(url, 200)

self.assertEqual(response["count"], 0)
self.assertEqual(response["total"], 0)
self.assertEqual(response["results"], [])

def test_delete_non_existent_host(self):
url = HOST_URL + "/" + generate_uuid()

self.delete(url, 404)

def test_delete_with_invalid_host_id(self):
url = HOST_URL + "/" + "notauuid"

self.delete(url, 400)


class QueryTestCase(PreCreatedHostsBaseTestCase):
def test_query_all(self):
response = self.get(HOST_URL, 200)
Expand Down Expand Up @@ -1756,5 +1804,6 @@ def test_version(self):
response = self.get(VERSION_URL, 200)
assert response['version'] is not None


if __name__ == "__main__":
unittest.main()
Loading