I'm writing a Python application that uses OpenStack to give students access to a limited number of virtual machines.
Students can place reservations, either now or in the future.
I need to limit the number of virtual machines scheduled at any one time to X, while still allowing students to reserve VMs as long as slots are available.
Reservation objects look like the code below (SQLAlchemy). I know the start time and the requested length of a reservation, at which point I need to go through the existing reservations and check whether there are already too many in the requested time period. The *_job fields are the names of APScheduler jobs.
    class Reservation(Entity):
        student = ManyToOne('Student', required=True)
        class_id = ManyToOne('Class', required=True)
        image = ManyToOne('Image', required=True)

        # openstack image id filled in once the instance is started
        instance_id = Field(UnicodeText)

        # apscheduler jobs
        stop_instance_job = Field(UnicodeText)
        start_instance_job = Field(UnicodeText)
        warn_reservation_ending_job = Field(UnicodeText)
        check_instance_job = Field(UnicodeText)
Any pointers on where to look for examples of scheduling algorithms or something like that? I'm not even sure what to search for...
asked Jul 29 '12 at 02:07
You should look up grid-based schedulers. Normally a scheduler doesn't know the true execution time (or duration of resource use) of a job, so complicated heuristics are used to guess how long it will take (for examples of such heuristics on a grid scheduler, see the PDF "Describing Scheduling on Grid basis"). In your case, a simpler approach with a basic grid representing workload over time will most likely meet your needs. Python doesn't have any great grid object libraries that I know of (I've implemented a few in C++ and Python before, and they're not too hard). Take a look at the numpy package, whose multi-dimensional arrays can emulate or implement a grid easily enough.
Msw mentioned Dijkstra's Banker's Algorithm, which is a form of job scheduling. However, your problem cares more about future state than about current state, and you know the true task times in advance. A grid of T (timesteps) by N (number of resources, which might be just 1) by M (maximum resource reservations), filled in as jobs are registered, would therefore suffice. Determining whether a particular job can be scheduled in a particular time slot takes O(task_length * M) checks on the subsection of the grid (start, stop) x (required_resources) x (1, M), looking for an empty slot.
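A minimal sketch of that capability check, assuming a single resource type (N = 1) and plain Python lists rather than numpy. The class name ReservationGrid, its methods, and the half-hour timestep are all hypothetical choices for illustration, not something from the question:

```python
class ReservationGrid:
    """T timesteps x M slots; each cell holds a reservation id or None."""

    def __init__(self, timesteps, max_reservations):
        self.cells = [[None] * max_reservations for _ in range(timesteps)]

    def can_reserve(self, start, length):
        """True if every timestep in [start, start + length) has a free slot."""
        stop = start + length
        if start < 0 or length <= 0 or stop > len(self.cells):
            return False
        # O(length * M): scan each timestep's row for an empty cell.
        return all(None in row for row in self.cells[start:stop])

    def reserve(self, reservation_id, start, length):
        """Fill one free cell per timestep; return False if the range is full."""
        if not self.can_reserve(start, length):
            return False
        for row in self.cells[start:start + length]:
            row[row.index(None)] = reservation_id
        return True


# Assumed setup: 48 half-hour timesteps (one day) and X = 2 VMs.
grid = ReservationGrid(timesteps=48, max_reservations=2)
grid.reserve("r1", start=0, length=4)      # occupies timesteps 0..3
grid.reserve("r2", start=2, length=4)      # occupies timesteps 2..5
print(grid.can_reserve(start=3, length=2))  # False: both slots are taken at timestep 3
print(grid.can_reserve(start=4, length=2))  # True: only r2 is active at timesteps 4..5
```

Since only the number of concurrent reservations matters here, the sketch does not try to keep a reservation in the same column across timesteps. A numpy integer array of per-timestep counts would work just as well and would make the range check a single slice comparison.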
Finding an adequate location for a particular job (picking the start time) is a more difficult task. It could be achieved with a modified Dijkstra's algorithm or with any standard scheduler (msw's comment is more helpful for this task than for the time slot capability check). Note that a lot of the scheduler content online is specific to OS process scheduling, which cares more about the type of operation (I/O or not) and about penalties for taking longer than expected than about abstract resource use. Google searches for schedulers will therefore often turn up Linux scheduler implementations rather than techniques for arbitrary data. Try looking up shortest-job schedulers, which tend to be simpler and are usually explained with less reliance on OS tasks.
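For the simplest version of picking a start time, a plain first-fit scan may already be enough. The sketch below is not the modified Dijkstra's algorithm mentioned above; it just walks a list of per-timestep reservation counts and returns the first start at which the requested length fits. The function name earliest_start and the occupancy list are assumptions made for illustration:

```python
def earliest_start(occupancy, capacity, length, not_before=0):
    """Return the first start >= not_before where `length` consecutive
    timesteps are below capacity, or None if no such window exists.

    occupancy[t] is the number of reservations active at timestep t.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    run = 0  # number of consecutive non-full timesteps ending at t
    for t in range(not_before, len(occupancy)):
        run = run + 1 if occupancy[t] < capacity else 0
        if run == length:
            return t - length + 1
    return None


# Assumed example: X = 2 VMs, eight timesteps with these reservation counts.
occupancy = [2, 2, 1, 2, 1, 1, 1, 0]
print(earliest_start(occupancy, capacity=2, length=3))  # 4
print(earliest_start(occupancy, capacity=2, length=5))  # None
```

The scan is O(T) per request, which should be fine for a day or a week of timesteps. Anything smarter (for example, preferring start times that leave larger contiguous gaps) is where the scheduling literature becomes useful.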