BUG: client.list() returns empty list for windows nt ftp server #104

Open
Andrei-Pozolotin opened this issue Dec 27, 2019 · 4 comments

@Andrei-Pozolotin

  1. For example, take this server:
  • ftp://ftp.nasdaqtrader.com
  1. With the following snippet, client.list() returns an empty list for this Windows NT FTP server:
import aioftp
import asyncio
from urllib.parse import urlparse


async def ftp_list(remote_url:str):
    remote_bag = urlparse(remote_url)
    ftp_host = remote_bag.hostname
    ftp_port = remote_bag.port or aioftp.DEFAULT_PORT
    ftp_user = remote_bag.username or "anonymous"
    ftp_pass = remote_bag.password or "[email protected]"
    session = aioftp.ClientSession(
        host=ftp_host, port=ftp_port, user=ftp_user, password=ftp_pass,
    )
    async with session as client:
        entry_list = await client.list(path="/")
        print(entry_list)
        for path, info in entry_list:
            print(path, info)


remote_url = "ftp://ftp.nasdaqtrader.com"
asyncio.run(ftp_list(remote_url))
  1. The server help lists neither the MLSD nor the LIST command that aioftp relies on:
ftp> help
Commands may be abbreviated.  Commands are:

!               dir             macdef          proxy           site
$               disconnect      mdelete         sendport        size
account         epsv4           mdir            put             status
append          form            mget            pwd             struct
ascii           get             mkdir           quit            system
bell            glob            mls             quote           sunique
binary          hash            mode            recv            tenex
bye             help            modtime         reget           trace
case            idle            mput            rstatus         type
cd              image           newer           rhelp           user
cdup            ipany           nmap            rename          umask
chmod           ipv4            nlist           reset           verbose
close           ipv6            ntrans          restart         ?
cr              lcd             open            rmdir
delete          lpwd            passive         runique
debug           ls              prompt          send
  1. The actual listing, as shown in a web browser:
Name | Size | Date Modified
-- | -- | --
aspnet_client/ |   | 9/12/12, 7:06:00 AM
atsactivity/ |   | 9/12/12, 3:52:00 AM
ClosingCross/ |   | 1/29/08, 2:08:00 AM
Downloads/ |   | 1/29/08, 2:08:00 AM
ETFData/ |   | 1/29/08, 2:08:00 AM
MonthlyShareVolume/ |   | 1/29/08, 2:08:00 AM
OpeningCross/ |   | 1/29/08, 2:08:00 AM
OrderExecutionQuality/ |   | 6/30/10, 8:29:00 AM
OrderExecutionQualityBX/ |   | 6/30/10, 8:29:00 AM
OrderExecutionQualityPSX/ |   | 11/30/10, 8:44:00 AM
phlx/ |   | 9/23/08, 2:34:00 PM
SymbolDirectory/ |   | 9/12/12, 4:22:00 AM
  1. Similar code with the same URL works fine when using ftplib:
    https://docs.python.org/3/library/ftplib.html
@pohmelie
Collaborator

First of all, this server does not support MLSx commands. You can verify this by calling logging.basicConfig(level=logging.DEBUG) before your code.
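
A minimal sketch of that logging setup, for anyone reproducing the trace. The logger name "aioftp.client" is taken from the output in this thread; treat it as an assumption for other aioftp versions.

```python
import logging

# Configure the root logger before the aioftp session is created,
# otherwise the first commands of the control connection are not shown.
logging.basicConfig(level=logging.DEBUG)

# Assumption: the client logs under "aioftp.client", as the trace here suggests.
client_log = logging.getLogger("aioftp.client")
client_log.setLevel(logging.DEBUG)
```

With this in place, each command sent and each server reply should appear as one log line.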

DEBUG:asyncio:Using selector: EpollSelector
INFO:aioftp.client:220
INFO:aioftp.client:USER anonymous
INFO:aioftp.client:331 Anonymous access allowed, send identity (e-mail name) as password.
INFO:aioftp.client:PASS [email protected]
INFO:aioftp.client:230 User logged in.
INFO:aioftp.client:TYPE I
INFO:aioftp.client:200 Type set to I.
INFO:aioftp.client:EPSV
INFO:aioftp.client:229 Entering Extended Passive Mode (|||37882|)
INFO:aioftp.client:MLSD /
INFO:aioftp.client:500 'MLSD /': command not understood.
INFO:aioftp.client:TYPE I
INFO:aioftp.client:200 Type set to I.
INFO:aioftp.client:EPSV
INFO:aioftp.client:229 Entering Extended Passive Mode (|||37883|)
INFO:aioftp.client:LIST /
INFO:aioftp.client:125 Data connection already open; Transfer starting.
...

Then you can see (via extra logging or Wireshark) that the data connection does carry the file entries:

INFO:aioftp.client:125 Data connection already open; Transfer starting.
b'09-12-12  12:06PM       <DIR>          aspnet_client\r\n'
b'09-12-12  08:52AM       <DIR>          atsactivity\r\n'
b'01-29-08  08:08AM       <DIR>          ClosingCross\r\n'
b'01-29-08  08:08AM       <DIR>          Downloads\r\n'
b'01-29-08  08:08AM       <DIR>          ETFData\r\n'
b'01-29-08  08:08AM       <DIR>          MonthlyShareVolume\r\n'
b'01-29-08  08:08AM       <DIR>          OpeningCross\r\n'
b'06-30-10  01:29PM       <DIR>          OrderExecutionQuality\r\n'
b'06-30-10  01:29PM       <DIR>          OrderExecutionQualityBX\r\n'
b'11-30-10  02:44PM       <DIR>          OrderExecutionQualityPSX\r\n'
b'09-23-08  07:34PM       <DIR>          phlx\r\n'
b'09-12-12  09:22AM       <DIR>          SymbolDirectory\r\n'
b''
INFO:aioftp.client:226 Transfer complete.

But the problem is in the parsing. I'm not a fan of the LIST command: it has no strict format and is meant for humans. This has been discussed a lot in the issues, and each time it blows up I get another confirmation that this command should not be used at all. If, and only if, @jw4js has the time and energy to invest in updating the LIST parsing routine, this will be fixed. Historically, the idea behind aioftp was not to use LIST at all. Sorry for that, but legacy bites.
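
To make the parsing problem concrete, here is a rough sketch of a parser for the DOS-style lines in the dump above. It is not aioftp's parser: the regular expression, the function name parse_dos_list_line and the choice to return fact values as strings are assumptions made for illustration, and the layout is inferred only from the sample lines in this thread.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Assumed layout, inferred from the raw lines shown in this thread:
#   MM-DD-YY  HH:MM(AM|PM)  (<DIR> | size)  name
DOS_LINE = re.compile(
    r"^(?P<date>\d{2}-\d{2}-\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}[AP]M)\s+"
    r"(?P<size><DIR>|\d+)\s+"
    r"(?P<name>.+?)\s*$"
)


def parse_dos_list_line(raw: bytes):
    """Parse one DOS-style LIST line into (path, facts)."""
    match = DOS_LINE.match(raw.decode("utf-8"))
    if match is None:
        raise ValueError(f"unrecognized LIST line: {raw!r}")
    # The listing carries no time zone; the value is used as-is.
    stamp = datetime.strptime(
        f"{match['date']} {match['time']}", "%m-%d-%y %I:%M%p"
    )
    is_dir = match["size"] == "<DIR>"
    facts = {
        "type": "dir" if is_dir else "file",
        "size": "0" if is_dir else match["size"],
        "modify": stamp.strftime("%Y%m%d%H%M%S"),
    }
    return PurePosixPath(match["name"]), facts
```

Capturing the name as "everything after the size column" keeps file names that contain spaces intact, which a plain whitespace split would break.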

@pohmelie
Collaborator

I've just released version 0.14.0, so you now have the option to force your own parsing routine: https://aioftp.readthedocs.io/client_api.html#aioftp.Client

@Andrei-Pozolotin
Author

@pohmelie Nikita:

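  1. Thank you so much for the fix, it works (see below).

  2. May I suggest a few other corrections to the project:

  • typo, a doubled “type”:
    “modify”, “type”, “type”, “size”
    should read
    “modify”, “type”, “size”
  • please use time.time standard utc float timestamp representation for info['modify']
  • please synchronize file modification time stamp upon transfer, so the following works:
        await client.download(source=file_src, destination=file_dst, write_into=True)
        assert info['modify'] == os.path.getmtime(file_dst) # TODO
  • please rename pohmelie -> Nikita_Melentev
    as they say: "greater respect for the authority of one's superiors brings the children better karma" :-)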
  1. sample code to verify the fix:
import re
import os
import time
import aioftp
import asyncio
import pathlib
from urllib.parse import urlparse
from typing import Tuple, Mapping
from datetime import datetime

this_dir = os.path.dirname(__file__)
temp_dir = f"{this_dir}/tempdir"


def ftp_std_stamp(stamp:str) -> str:
    "convert remote stamp into aioftp format"
    return datetime.strptime(stamp, "%m-%d-%y%I:%M%p").strftime("%Y%m%d%H%M%S")


def ftp_line_parser(list_line:bytes) -> Tuple[pathlib.Path, Mapping]:
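    r"""Parse Windows NT style LIST lines such as:
    b'12-30-19  03:00AM  <DIR>  regnms\r\n'
    b'12-30-19  02:03PM  694484 psxtraded.txt\r\n'
    """
    list_text = list_line.decode()
    # raw string: "\s" in a plain literal is an invalid escape sequence
    term_list = re.split(r"\s+", list_text)
    # four fields plus the empty string left by the trailing "\r\n"
    assert len(term_list) == 5, f"unexpected list_line: {list_line}"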
    has_file = term_list[2].isdigit()
    line_modify = ftp_std_stamp(term_list[0] + term_list[1])
    line_type = "file" if has_file else "dir"
    line_size = int(term_list[2]) if has_file else 0
    line_path = pathlib.Path(term_list[3])
    line_info = dict(
        modify=line_modify,
        type=line_type,
        size=line_size,
    )
    return (line_path, line_info)


async def ftp_win_nt_list(remote_url:str, remote_path:str) -> None:
    "verify parse_list_line_custom"
    remote_bag = urlparse(remote_url)
    ftp_host = remote_bag.hostname
    ftp_port = remote_bag.port or aioftp.DEFAULT_PORT
    ftp_user = remote_bag.username or "anonymous"
    ftp_pass = remote_bag.password or "[email protected]"
    session = aioftp.ClientSession(
        host=ftp_host, port=ftp_port, user=ftp_user, password=ftp_pass,
        parse_list_line_custom=ftp_line_parser,
    )
    async with session as client:
        entry_list = await client.list(path=remote_path)
        assert len(entry_list) > 0
        for path, info in entry_list:
            print(path, info)
            await ftp_file_download(client, path, info)


async def ftp_file_download(client, path, info) -> None:
    if info['type'] == "file" and info['size'] <= 1024:
        print(f"ftp_file_download: {path}")
        file_src = path
        file_dst = f"{temp_dir}/{path}-{time.time()}"
        assert not os.path.exists(file_dst)
        await client.download(source=file_src, destination=file_dst, write_into=True)
        assert os.path.exists(file_dst)
        assert info['size'] == os.path.getsize(file_dst)
        # assert info['modify'] == os.path.getmtime(file_dst) # TODO


remote_url = "ftp://ftp.nasdaqtrader.com"
remote_path = "/SymbolDirectory"
asyncio.run(ftp_win_nt_list(remote_url, remote_path))

@pohmelie
Collaborator

may I suggest few other corrections to the project

Feel free to make a pull request. I fixed the typo about the double "type".

please use time.time standard utc float timestamp representation for info['modify']

Not sure if I got you right, but all MLSx facts are strings. Moreover, there is a pretty strict description of the modify field, and it is not a UTC timestamp: https://tools.ietf.org/html/rfc3659#section-2.3
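
For readers who do want a float, RFC 3659 defines the value as a YYYYMMDDHHMMSS string (optionally with a fractional part) expressed in UTC, so it can be converted on the caller's side. The helper below is a sketch with a made-up name, modify_to_timestamp; it is not part of aioftp.

```python
from datetime import datetime, timezone


def modify_to_timestamp(modify: str) -> float:
    """Convert an RFC 3659 time-val (YYYYMMDDHHMMSS[.sss]) to POSIX seconds."""
    whole, _, fraction = modify.partition(".")
    # RFC 3659 time values are in UTC, so attach that zone explicitly.
    moment = datetime.strptime(whole, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    seconds = moment.timestamp()
    if fraction:
        seconds += float(f"0.{fraction}")
    return seconds
```

Note that a modify value produced by a custom LIST parser may be in the server's local time instead, in which case the UTC assumption does not hold.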

please synchronize file modification time stamp upon transfer, so the following works

This is a good point. Not sure it is a major issue (since no one uses file modification/creation times at all), but I agree with you. Feel free to make a PR.
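
Until something like that lands in the library, the modification time can be applied by hand after the download. This is only a sketch: apply_modify_time is a hypothetical helper, and it assumes the modify fact is a YYYYMMDDHHMMSS string in UTC.

```python
import os
from datetime import datetime, timezone


def apply_modify_time(local_path: str, modify: str) -> None:
    """Set the local file's mtime from a YYYYMMDDHHMMSS modify fact."""
    # Assumption: the fact is in UTC; any fractional part is ignored.
    moment = datetime.strptime(modify[:14], "%Y%m%d%H%M%S")
    mtime = moment.replace(tzinfo=timezone.utc).timestamp()
    # Keep the current access time and replace only the modification time.
    atime = os.stat(local_path).st_atime
    os.utime(local_path, (atime, mtime))
```

Called right after client.download(...), as in apply_modify_time(file_dst, info['modify']), this would let a check against os.path.getmtime(file_dst) compare like with like once the fact is converted to seconds.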

please rename pohmelie -> Nikita_Melentev

This is irrelevant to aioftp.
